Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
Jeff Davis wrote: On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:

When you create a temporary table, PostgreSQL needs to add rows in pg_class, pg_attribute, and probably other system catalogs. So there are writes, which aren't possible in a read-only transaction. Hence the error. And no, there is no workaround.

That sounds like a deficiency to overcome. It should be possible for those system catalogs to be virtual, defined like union views over similar immutable tables for the read-only database plus mutable in-memory ones for the temporary tables.

Ideally, yes; from a logical standpoint there are catalog entries that are only interesting to one backend. But that doesn't mean it's easy to do. Remember that catalog lookups (even though most go through a cache) are a path that is important to performance. Also, more complex catalog interpretations may introduce some extra bootstrapping challenges.

Are there any plans in the works to do this?

I don't think so. It sounds like some fairly major work for a comparatively minor benefit. Suggestions welcome, of course, to either make the work look more minor or the benefits look more major ;)

What I said before was a simplification; below I present my real proposal.

I think an even better way to support this would be based on Postgres having support for directly using multiple databases within the same SQL session at once, as if namespaces were another level deep: the first level being the databases, the second level the schemas, and the third level the schema objects. Kind of like how the SQL standard defines its catalog/schema/object namespaces. This instead of needing to use federating or that contrib module to use multiple Pg databases of the same cluster at once.

Under this scenario, we make the property of being read-only or read-write for the current SQL session associated with a database rather than with the whole SQL session.
A given transaction can read from any database but can only make changes to the ones that are not read-only. Also, the proper way to do temporary tables would be to put them in a database other than the main one, where the whole other database has the property of being temporary.

Under this scenario, there would be separate system catalogs for each database, and so the ones for read-only databases are read-only, and the ones for other databases aren't. Then the system catalog itself fundamentally isn't more complicated, per database, and anything extra to handle cross-database queries or whatever, if anything, is a separate layer. Code that only deals with a single database at once would be an optimized situation and perform no worse than it does now.

Furthermore, federating databases is done with the same interface, by adding remote/foreign databases as extra databases at the top-level namespace.

Fundamentally, a SQL session would be associated with a Pg server, not a database managed by such. When one starts a SQL session, there are initially no databases visible to them, and the top-level namespace is empty. They then mount a database, similarly to how one mounts an OS filesystem, by providing appropriate connection info: either just the database name, or also user/pass, or also remote host etc. as applicable. These details are the difference between using a local/same-Pg-cluster db and a remote/federated one, and they also say whether the database is temporary or initially read-only etc. See also how SQLite works; this mount is analogous to their attach.

Such a paradigm is also how my Muldis D language interfaces databases; this is the most flexible, portable, extensible, optimizable, and elegant approach I can think of. -- Darren Duncan

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
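The attach model referenced above is real, working SQLite syntax today; a minimal sketch (the database file and table names here are hypothetical illustrations, not from the thread):

```sql
-- SQLite: a session starts on one database ("main") and can attach
-- others, each visible under its own top-level name.
ATTACH DATABASE 'archive.db' AS archive;

-- Cross-database query: the database name qualifies the table name,
-- one namespace level above the table.
SELECT o.id, n.note
FROM   main.orders o
JOIN   archive.order_notes n ON n.order_id = o.id;

DETACH DATABASE archive;
```

Darren's proposal would add a schema level in between (database.schema.object), matching the SQL standard's catalog.schema.object naming.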
Re: [GENERAL] Insufficient privileges.
Dave Coventry wrote: I am getting the following error message in my Drupal install.

PDOException: SQLSTATE[42501]: Insufficient privilege: 7 ERROR: permission denied for sequence currenttest_id_seq: INSERT INTO currentTest (score) VALUES (:db_insert_placeholder_0);

This is a table that I created using the postgres super user. I have tried to grant the drupal user (drupaluser) privileges to the table with:

GRANT ALL ON currentTest TO drupaluser;

but this fails to resolve the issue. Can anyone suggest a way forward?

GRANT USAGE ON SEQUENCE currenttest_id_seq TO drupaluser;

I think that it is a good idea to have different users for table creation and usage (if possible). But I would not use a superuser account.

Yours, Laurenz Albe
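A sketch of the complete fix: the INSERT fails because the serial column's default calls nextval() on the sequence, which needs its own privilege in addition to the table grant (object names taken from the error message above):

```sql
-- Table privileges alone are not enough when a column default
-- draws from a sequence.
GRANT SELECT, INSERT, UPDATE ON currentTest TO drupaluser;

-- nextval() requires USAGE (or UPDATE) on the sequence itself;
-- SELECT additionally allows currval().
GRANT USAGE, SELECT ON SEQUENCE currenttest_id_seq TO drupaluser;
```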
Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow
Op 07-07-11 18:23, Tom Lane schreef: Karsten Hilbert <karsten.hilb...@gmx.net> writes: On Thu, Jul 07, 2011 at 11:14:05AM -0400, Tom Lane wrote:

I'm betting Debian hasn't fixed that bug yet either and so you need this post-beta2 patch: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c2ba0121c73b7461331104a46d140156e847572a

Do they know about this gcc bug?

Can't say about Debian in particular, but upstream gcc certainly knows about it. https://bugzilla.redhat.com/show_bug.cgi?id=712480 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49390

I'm guessing this is probably the cause of my problem, then. I'll try to rebuild with an earlier version of GCC, and let you know the results. Thanks to all of you!

David
Re: [GENERAL] DELETE taking too much memory
On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:

Hi, I have a delete query taking 7.2G of ram (and counting) but I do not understand why so much memory is necessary. The server has 12G, and I'm afraid it'll go into swap. Using postgres 8.3.14. I'm purging some old data from table t1, which should cascade-delete referencing rows in t2. Here's an anonymized rundown:

# explain delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101');

It looks as though you're hitting one of the known issues with PostgreSQL and FKs. The FK constraint checks and CASCADE actions are implemented using AFTER triggers, which are queued up during the query to be executed at the end. For very large queries, this queue of pending triggers can become very large, using up all available memory. There's a TODO item to try to fix this for a future version of PostgreSQL (maybe I'll have another go at it for 9.2), but at the moment all versions of PostgreSQL suffer from this problem.

The simplest work-around for you might be to break your deletes up into smaller chunks, say 100k or 1M rows at a time, eg:

delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101' limit 100000);

Regards, Dean
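A sketch of the chunked approach Dean describes (the batch size is illustrative): each statement runs in its own transaction, so the AFTER-trigger queue is bounded by the batch size, and the client simply repeats the statement until it reports DELETE 0:

```sql
-- Run repeatedly (e.g. from a shell or application loop) until it
-- deletes zero rows; each run queues at most 100k cascade triggers
-- instead of one queue entry per deleted row for the whole purge.
DELETE FROM t1
WHERE t1id IN (SELECT t1id FROM t2
               WHERE foo = 0
                 AND bar < '20101101'
               LIMIT 100000);
```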
Re: [GENERAL] DELETE taking too much memory
On Thursday 07 July 2011 22:26:45 Guillaume Lelarge wrote: On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:

Hi, I have a delete query taking 7.2G of ram (and counting) but I do not understand why so much memory is necessary. The server has 12G, and I'm afraid it'll go into swap. Using postgres 8.3.14. I'm purging some old data from table t1, which should cascade-delete referencing rows in t2. Here's an anonymized rundown:

# \d t1
                          Table "public.t1"
 Column |  Type   |                     Modifiers
--------+---------+---------------------------------------------------
 t1id   | integer | not null default nextval('t1_t1id_seq'::regclass)
(...snip...)
Indexes:
    "message_pkey" PRIMARY KEY, btree (id)
(...snip...)

# \d t2
                              Table "public.t2"
 Column |            Type             |                     Modifiers
--------+-----------------------------+---------------------------------------------------
 t2id   | integer                     | not null default nextval('t2_t2id_seq'::regclass)
 t1id   | integer                     | not null
 foo    | integer                     | not null
 bar    | timestamp without time zone | not null default now()
Indexes:
    "t2_pkey" PRIMARY KEY, btree (t2id)
    "t2_bar_key" btree (bar)
    "t2_t1id_key" btree (t1id)
Foreign-key constraints:
    "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE RESTRICT ON DELETE CASCADE

# explain delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101');
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Nested Loop  (cost=5088742.39..6705282.32 rows=30849 width=6)
   ->  HashAggregate  (cost=5088742.39..5089050.88 rows=30849 width=4)
         ->  Index Scan using t2_bar_key on t2  (cost=0.00..5035501.50 rows=21296354 width=4)
               Index Cond: (bar < '2010-11-01 00:00:00'::timestamp without time zone)
               Filter: (foo = 0)
   ->  Index Scan using t1_pkey on t1  (cost=0.00..52.38 rows=1 width=10)
         Index Cond: (t1.t1id = t2.t1id)
(7 rows)

Note that the estimate of 30849 rows is way off: there should be around 55M rows deleted from t1, and 2-3 times as many from t2. When looking at the plan, I can easily imagine that data gets accumulated below the nested loop (thus using all that memory), but why isn't each entry freed once one row has been deleted from t1? That entry isn't going to be found again in t1 or in t2, so why keep it around?
Is there a better way to write this query? Would postgres 8.4/9.0 handle things better?

Do you have any DELETE triggers in t1 and/or t2?

No, there are triggers on insert/update to t1 which both insert into t2, but no delete trigger. Deletions do cascade from t1 to t2 because of the foreign key.

-- Vincent de Phily Mobile Devices +33 (0) 142 119 325 +353 (0) 85 710 6320
Re: [GENERAL] [PERFORM] DELETE taking too much memory
On Thursday 07 July 2011 19:54:08 French, Martin wrote:

How up to date are the statistics for the tables in question? What value do you have for effective cache size? My guess would be that the planner thinks the method it is using is right either for its current row number estimations, or the amount of memory it thinks it has to play with.

Not very up to date I'm afraid (as shown by the low estimate of deleted rows). Table t2 has been insert-only since its re-creation (that's another story), while t1 is your classic insert-many, update-recent. We haven't tweaked effective cache size yet; it's on the TODO... like many other things :/

-- Vincent de Phily
Re: [GENERAL] DELETE taking too much memory
On Friday 08 July 2011 10:05:47 Dean Rasheed wrote: On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:

Hi, I have a delete query taking 7.2G of ram (and counting) but I do not understand why so much memory is necessary. The server has 12G, and I'm afraid it'll go into swap. Using postgres 8.3.14. I'm purging some old data from table t1, which should cascade-delete referencing rows in t2. Here's an anonymized rundown:

# explain delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101');

It looks as though you're hitting one of the known issues with PostgreSQL and FKs. The FK constraint checks and CASCADE actions are implemented using AFTER triggers, which are queued up during the query to be executed at the end. For very large queries, this queue of pending triggers can become very large, using up all available memory. There's a TODO item to try to fix this for a future version of PostgreSQL (maybe I'll have another go at it for 9.2), but at the moment all versions of PostgreSQL suffer from this problem.

That's very interesting, and a more plausible not-optimized-yet item than my guesses so far, thanks. Drop me a mail if you work on this, and I'll find some time to test your code. I'm wondering though: this sounds like the behaviour of a deferrable fkey, which AFAICS is not the default and not my case? I haven't explored that area of constraints yet, so there's certainly some detail that I'm missing.

The simplest work-around for you might be to break your deletes up into smaller chunks, say 100k or 1M rows at a time, eg:

delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101' limit 100000);

Yes, that's what we ended up doing. We canceled the query after 24h, shortly before the OOM killer would have, and started doing things in smaller batches.
-- Vincent de Phily
Re: [GENERAL] [PERFORM] DELETE taking too much memory
On Friday 08 July 2011 10:31:33 French, Martin wrote:

If the query planner thinks it has the default amount of memory (128MB) and the stats are out of date, then it will by no means be able to plan proper execution. I would recommend setting the effective_cache_size to an appropriate value, running analyze on both tables with an appropriate stats target, and then explaining the query again to see if it's more accurate.

Yes, I'll schedule those two to run during the night and repost an explain, for information. However, we worked around the initial problem by running the delete in smaller batches. Thanks.

-- Vincent de Phily
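Martin's advice might look like this in practice; a sketch only, with the statistics target and cache size values chosen for illustration rather than taken from the thread:

```sql
-- Larger per-column sample for the skewed filter column, then
-- refresh the statistics on both tables.
ALTER TABLE t2 ALTER COLUMN bar SET STATISTICS 500;
ANALYZE t1;
ANALYZE t2;

-- Tell the planner roughly how much of the database the OS is likely
-- to be caching (normally set in postgresql.conf; per-session here).
SET effective_cache_size = '8GB';

-- Re-check whether the row estimates have improved.
EXPLAIN DELETE FROM t1
WHERE t1id IN (SELECT t1id FROM t2 WHERE foo = 0 AND bar < '20101101');
```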
Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow
Op 08-07-11 10:19, David Hartveld schreef: Op 07-07-11 18:23, Tom Lane schreef: Karsten Hilbert <karsten.hilb...@gmx.net> writes: On Thu, Jul 07, 2011 at 11:14:05AM -0400, Tom Lane wrote:

I'm betting Debian hasn't fixed that bug yet either and so you need this post-beta2 patch: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c2ba0121c73b7461331104a46d140156e847572a

Do they know about this gcc bug?

Can't say about Debian in particular, but upstream gcc certainly knows about it. https://bugzilla.redhat.com/show_bug.cgi?id=712480 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49390

I'm guessing this is probably the cause of my problem, then. I'll try to rebuild with an earlier version of GCC, and let you know the results.

I've rebuilt with debian gcc 4.4.5-8 and am running 9.1 now without any of the above problems. I've reported a bug for debian postgresql-9.1. I've also seen that the current gcc-4.6 version for debian is 4.6.1, so that shouldn't create any problems anymore (or at least the gcc bug is marked as fixed in 4.6.1 in the gcc bugtracker). Thanks for all your time, guys!

Greetings, David
Re: [GENERAL] DELETE taking too much memory
On 8 July 2011 10:44, Vincent de Phily vincent.deph...@mobile-devices.fr wrote: On Friday 08 July 2011 10:05:47 Dean Rasheed wrote: On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:

Hi, I have a delete query taking 7.2G of ram (and counting) but I do not understand why so much memory is necessary. The server has 12G, and I'm afraid it'll go into swap. Using postgres 8.3.14. I'm purging some old data from table t1, which should cascade-delete referencing rows in t2. Here's an anonymized rundown:

# explain delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101');

It looks as though you're hitting one of the known issues with PostgreSQL and FKs. The FK constraint checks and CASCADE actions are implemented using AFTER triggers, which are queued up during the query to be executed at the end. For very large queries, this queue of pending triggers can become very large, using up all available memory. There's a TODO item to try to fix this for a future version of PostgreSQL (maybe I'll have another go at it for 9.2), but at the moment all versions of PostgreSQL suffer from this problem.

That's very interesting, and a more plausible not-optimized-yet item than my guesses so far, thanks. Drop me a mail if you work on this, and I'll find some time to test your code. I'm wondering though: this sounds like the behaviour of a deferrable fkey, which AFAICS is not the default and not my case? I haven't explored that area of constraints yet, so there's certainly some detail that I'm missing.

Yes, it's the same issue that affects deferrable PK and FK constraints, but even non-deferrable FKs use AFTER ROW triggers that suffer from this problem. These triggers don't show up in a \d from psql, but they are there (try select * from pg_trigger where tgconstrrelid = 't1'::regclass) and, because they fire AFTER rather than BEFORE, queuing up large numbers of them is a problem.
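A slightly more readable form of the catalog query Dean suggests, resolving the trigger function name as well (all column names are real pg_trigger columns; the constraint here is the t2_t1id_fkey from earlier in the thread):

```sql
-- List the hidden referential-integrity triggers whose constraint
-- references t1. For an ON DELETE CASCADE foreign key, tgfoid
-- resolves to internal functions such as "RI_FKey_cascade_del".
SELECT tgname,
       tgfoid::regproc AS trigger_function
FROM   pg_trigger
WHERE  tgconstrrelid = 't1'::regclass;
```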
Regards, Dean

The simplest work-around for you might be to break your deletes up into smaller chunks, say 100k or 1M rows at a time, eg:

delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101' limit 100000);

Yes, that's what we ended up doing. We canceled the query after 24h, shortly before the OOM killer would have, and started doing things in smaller batches.

-- Vincent de Phily
Re: [PERFORM] [GENERAL] DELETE taking too much memory
On Fri, Jul 8, 2011 at 12:48 PM, Dean Rasheed dean.a.rash...@gmail.com wrote:

Yes, it's the same issue that affects deferrable PK and FK constraints, but even non-deferrable FKs use AFTER ROW triggers that suffer from this problem. These triggers don't show up in a \d from psql, but they are there (try select * from pg_trigger where tgconstrrelid = 't1'::regclass) and, because they fire AFTER rather than BEFORE, queuing up large numbers of them is a problem.

I would imagine an easy solution would be to compress the queue by inserting a single element representing all rows created by transaction id X. I.e., a delete or update will need to check all the row versions it creates with its txid, so this txid could be used to represent the rows that need checking afterwards, right?
Re: [PERFORM] [GENERAL] DELETE taking too much memory
On Fri, Jul 8, 2011 at 4:35 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:

Hi, I have a delete query taking 7.2G of ram (and counting) but I do not understand why so much memory is necessary. The server has 12G, and I'm afraid it'll go into swap. Using postgres 8.3.14. I'm purging some old data from table t1, which should cascade-delete referencing rows in t2. Here's an anonymized rundown:

# explain delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101');

It looks as though you're hitting one of the known issues with PostgreSQL and FKs. The FK constraint checks and CASCADE actions are implemented using AFTER triggers, which are queued up during the query to be executed at the end. For very large queries, this queue of pending triggers can become very large, using up all available memory. There's a TODO item to try to fix this for a future version of PostgreSQL (maybe I'll have another go at it for 9.2), but at the moment all versions of PostgreSQL suffer from this problem. The simplest work-around for you might be to break your deletes up into smaller chunks, say 100k or 1M rows at a time, eg:

delete from t1 where t1id in (select t1id from t2 where foo=0 and bar < '20101101' limit 100000);

I'd like to comment here: I had serious performance issues with a similar query (the planner did horrible things). Not sure if the planner will do the same dumb thing it did for me; my query was against the same table (i.e., t1=t2). I had this query:

delete from t1 where ctid in (select ctid from t1 where created_at < '20101231' limit 1); --- this was slow.

Changed to:

delete from t1 where ctid = any(array(select ctid from t1 where created_at < '20101231' limit 1)); --- a lot faster.

So... will the same principle work here? Doing this:

delete from t1 where t1id = any(array(select t1id from t2 where foo=0 and bar < '20101101' limit 100000)); -- would this query be faster than the original one?
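A side-by-side sketch of the rewrite being proposed (batch size illustrative). The hedge here is that planner behaviour varies by version; the general idea is that the ARRAY(...) form forces the subquery to be evaluated once and its result passed in as a constant array, rather than being planned as a join whose row estimate may be badly off:

```sql
-- IN (subquery): the planner chooses a join strategy based on its
-- (possibly wrong) estimate of the subquery's row count.
DELETE FROM t1
WHERE t1id IN (SELECT t1id FROM t2
               WHERE foo = 0 AND bar < '20101101'
               LIMIT 100000);

-- = ANY(ARRAY(subquery)): the subquery runs first, its result is
-- materialized as an array, and the DELETE matches against that.
DELETE FROM t1
WHERE t1id = ANY(ARRAY(SELECT t1id FROM t2
                       WHERE foo = 0 AND bar < '20101101'
                       LIMIT 100000));
```

Note that neither form addresses the AFTER-trigger memory issue; the LIMIT-based batching does that.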
Regards, Dean -- Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:

I think an even better way to support this would be based on Postgres having support for directly using multiple databases within the same SQL session at once, as if namespaces were another level deep: the first level being the databases, the second level the schemas, and the third level the schema objects. Kind of like how the SQL standard defines its catalog/schema/object namespaces. This instead of needing to use federating or that contrib module to use multiple Pg databases of the same cluster at once. Under this scenario, we make the property of a database being read-only or read-write for the current SQL session associated with a database rather than the whole SQL session. A given transaction can read from any database but can only make changes to the ones that are not read-only. Also, the proper way to do temporary tables would be to put them in a database other than the main one, where the whole other database has the property of being temporary. Under this scenario, there would be separate system catalogs for each database, and so the ones for read-only databases are read-only, and the ones for other databases aren't. Then the system catalog itself fundamentally isn't more complicated, per database, and anything extra to handle cross-database queries or whatever, if anything, is a separate layer. Code that only deals with a single database at once would be an optimized situation and perform no worse than it does now.

One challenge that jumps to mind here is that an Oid would need to become a pair (catalog, oid). Even if the end result isn't much more complex, getting there is not trivial.

See also how SQLite works; this mount being analogous to their attach.

I'm not sure SQLite is the best example. It has a radically different architecture.
Regards, Jeff Davis
Re: [GENERAL] [PERFORM] DELETE taking too much memory
If the query planner thinks it has the default amount of memory (128MB) and the stats are out of date, then it will by no means be able to plan proper execution. I would recommend setting the effective_cache_size to an appropriate value, running analyze on both tables with an appropriate stats target, and then explaining the query again to see if it's more accurate.

Cheers

-----Original Message----- From: Vincent de Phily [mailto:vincent.deph...@mobile-devices.fr] Sent: 08 July 2011 10:20 To: French, Martin Cc: pgsql-general@postgresql.org; pgsql-performa...@postgresql.org Subject: Re: [PERFORM] DELETE taking too much memory

On Thursday 07 July 2011 19:54:08 French, Martin wrote:

How up to date are the statistics for the tables in question? What value do you have for effective cache size? My guess would be that the planner thinks the method it is using is right either for its current row number estimations, or the amount of memory it thinks it has to play with.

Not very up to date I'm afraid (as shown by the low estimate of deleted rows). Table t2 has been insert-only since its re-creation (that's another story), while t1 is your classic insert-many, update-recent. We haven't tweaked effective cache size yet; it's on the TODO... like many other things :/

-- Vincent de Phily
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
I like Darren's proposal. It is elegant.

Date: Fri, 8 Jul 2011 18:38:59 +1200
From: gavinflo...@archidevsys.co.nz
To: dar...@darrenduncan.net
CC: pg...@j-davis.com; guilla...@lelarge.info; mbee...@hotmail.com; pgsql-general@postgresql.org; pgsql-hack...@postgresql.org
Subject: Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
On 08/07/11 18:21, Darren Duncan wrote: Jeff Davis wrote: On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote: When you create a temporary table, PostgreSQL needs to add rows in pg_class, pg_attribute, and probably other system catalogs. So there are writes, which aren't possible in a read-only transaction. Hence the error. And no, there is no workaround. That sounds like a deficiency to overcome. It should be possible for those system catalogs to be virtual, defined like union views over similar immutable tables for the read-only database plus mutable in-memory ones for the temporary tables. Ideally, yes, from a logical standpoint there are catalog entries that are only interesting to one backend. But that doesn't mean it's easy to do. Remember that catalog lookups (even though most go through a cache) are a path that is important to performance. Also, more complex catalog interpretations may introduce some extra bootstrapping challenges. Are there any plans in the works to do this? I don't think so. It sounds like some fairly major work for a comparatively minor benefit. Suggestions welcome, of course, to either make the work look more minor or the benefits look more major ;) What I said before was a simplification; below I present my real proposal. I think an even better way to support this is would be based on Postgres having support for directly using multiple databases within the same SQL session at once, as if namespaces were another level deep, the first level being the databases, the second level the schemas, and the third level the schema objects. Kind of like what the SQL standard defines its catalog/schema/object namespaces. This instead of needing to use federating or that contrib module to use multiple Pg databases of the same cluster at once. Under this scenario, we make the property of a database being read-only or read-write for the current SQL session associated with a database rather than the whole SQL session. 
A given transaction can read from any database but can only make changes to the ones that are not read-only. Also, the proper way to do temporary tables would be to put them in a database other than the main one, where the whole other database has the property of being temporary.

Under this scenario, there would be separate system catalogs for each database, so the ones for read-only databases are read-only and the ones for other databases aren't. Then the system catalog itself fundamentally isn't more complicated per database, and anything extra to handle cross-database queries or whatever, if anything, is a separate layer. Code that only deals with a single database at once would be an optimized situation and perform no worse than it does now. Furthermore, federating databases is done with the same interface, by adding remote/foreign databases as extra databases at the top-level namespace.

Fundamentally, a SQL session would be associated with a Pg server, not with a database managed by one. When one starts a SQL session, there are initially no databases visible, and the top-level namespace is empty. One then mounts a database, similarly to how one mounts an OS filesystem, by providing appropriate connection info: either just the database name, or also user/pass, or also remote host, etc., as applicable, these details being the difference between using a local/same-Pg-cluster db or a remote/federated one; the details also say whether it is temporary or initially read-only, etc. See also how SQLite works; this mount is analogous to their attach. Such a paradigm is also how my Muldis D language interfaces databases; it is the most flexible, portable, extensible, optimizable, and elegant approach I can think of. -- Darren Duncan

I would suggest that the default action for psql be as it is now: associate the session with a database named after the current O/S user.
However, a new psql flag, such as '-unattached' or '-N', could indicate that no database is to be attached when psql starts up. While I don't have a current need for what you propose, it does look interesting and potentially useful to me.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
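The SQLite attach behavior referenced above can be demonstrated with Python's built-in sqlite3 module: a session starts against one database and then "mounts" a second one under a chosen name, after which both are addressable through a database.table namespace, much like the proposed per-session mounting of Pg databases. A small self-contained sketch (in-memory databases and table names are illustrative):

```python
import sqlite3

# Open a session against one database...
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_t (id INTEGER)")
conn.execute("INSERT INTO main_t VALUES (1)")

# ...then ATTACH gives a second database a name in the session's
# top-level namespace, analogous to the proposed "mount".
conn.execute("ATTACH DATABASE ':memory:' AS scratch")
conn.execute("CREATE TABLE scratch.temp_t (id INTEGER)")
conn.execute("INSERT INTO scratch.temp_t VALUES (2)")

# Cross-database query: both databases addressable as db.table.
rows = conn.execute(
    "SELECT m.id, s.id FROM main_t m, scratch.temp_t s"
).fetchall()
print(rows)  # [(1, 2)]
```

Note that, exactly as discussed later in the thread, SQLite keeps each attached database self-contained: cross-database queries work, but constraints do not span the attached databases.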
[GENERAL] 'libpq.lib' linking problem with VC++
My apologies if this is not the appropriate list for this, but the novice list has provided no response for weeks. I'm using Visual C++ Express 2008 on Windows 7 with the standard (pre-packaged) 64-bit Windows distribution of PostgreSQL v9.0.4, and I am getting unresolved references when linking against libpq.lib. The relevant elements of my program are:

    #include "stdafx.h"
    #include "libpq-fe.h"

    using namespace System;

    int main(array<System::String ^> ^args)
    {
        ...
        char *pghost = "localhost",
             *pgport = "5432",
             *pgoptions = NULL,
             *pgtty = NULL;
        char *dbName = "mydb";
        PGconn *conn;

        /* make a connection to the database */
        conn = PQsetdbLogin(pghost, pgport, pgoptions, pgtty,
                            dbName, NULL, NULL);
        ...
    }

Linking against libpq.lib in the postgres lib directory, the error I get is:

    DBTest.obj : error LNK2001: unresolved external symbol
    "extern "C" struct pg_conn * __cdecl PQsetdbLogin( ...

In the project properties I have included the path to the postgres lib directory and listed libpq.lib as a linker input. So, what am I missing? How do I get this thing to link without error?

Regards,
- Roy

=*=*=
Always do right. This will gratify some people and astonish the rest. - Mark Twain
The truth is rarely pure, and never simple. - Oscar Wilde
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
Jeff Davis wrote:

On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:

I think an even better way to support this would be based on Postgres having support for directly using multiple databases within the same SQL session at once, as if namespaces were another level deep: the first level being the databases, the second level the schemas, and the third level the schema objects. [snip] Then the system catalog itself fundamentally isn't more complicated per database, and anything extra to handle cross-database queries or whatever, if anything, is a separate layer. Code that only deals with a single database at once would be an optimized situation and perform no worse than it does now.

One challenge that jumps to mind here is that an Oid would need to become a pair (catalog, oid). Even if the end result isn't much more complex, getting there is not trivial.

Yes, but that would just be in-memory or in temporary places external to every database. On disk, internal to a database, there would just be the oid. In fact, another aspect of the database model I defined is that each database is entirely self-contained; while you can do cross-database queries, you don't have cross-database constraints, in the general case.

See also how SQLite works; this mount being analogous to their attach.

I'm not sure SQLite is the best example. It has a radically different architecture.

Still, it's an example I know of where you can access several clearly separable databases at once through a common namespace. While one might argue this is a substitute for multiple-schema support, I don't, because with multiple schemas you can have integrity constraints that cross schemas. The namespaces issue is largely orthogonal to self-containment or integrity in my model.

But look at Oracle too, at least as I understand it. Oracle supports CONNECT TO ... AUTHORIZE ... etc. in SQL, meaning you can define what databases you are accessing within the SQL session, rather than having to do it externally.
I assume that Oracle's features correspond somewhat to my proposal, and so enable cross-database queries under the illusion that several databases are one. Suffice it to say, I have thought through my proposed model for years, one of its (and Muldis D's) express purposes being to provide a common normalized paradigm that all the existing SQL DBMSs can map to with consistent behavior, whether Oracle or SQLite, and I haven't stated all of it here (a lot more is in my published language spec). Key mapping points are the boundaries of a database's self-definability. And namespace nesting is actually arbitrary-depth, accounting for everything from no native schema support up to schema-plus-package namespace support. -- Darren Duncan
[GENERAL] long names get truncated
Hello,

I was bitten by a length-truncated role name used in a script, since the truncation only raises a NOTICE. The symptom was that some GRANTs ended up on the wrong objects after name truncation. Then I experimented with tables with long names and was surprised by the truncation behavior:

    test=# create table longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong(a int);
    NOTICE:  identifier "longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong" will be truncated to "longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglon"
    CREATE TABLE
    test=# \d longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong
    Did not find any relation named "longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong".
    test=# drop table longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglongNOT; --SURPRISE!
    NOTICE:  identifier "longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglongnot" will be truncated to "longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglon"
    DROP TABLE

One really has to pay attention to the length limit (63 bytes):

1) Name truncation is not an error.
2) psql \d doesn't work with long names; perhaps the same auto-truncation rules should apply?
3) DROPping a non-existent table with a truncated identifier unintentionally drops the long-name table.

For those curious, I hit the limits prefixing roles with UUIDs for automated testing, so that database-global objects can be deleted after the test. I wish there were a way to turn the truncation into an error. Is there some better way I could have caught this?

Cheers,
M
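For reference, the 63-byte limit is NAMEDATALEN - 1, with NAMEDATALEN defaulting to 64 at compile time. Since there appears to be no server-side switch to turn the truncation NOTICE into an error, one workaround is a client-side guard in the script that generates the DDL, failing loudly before an over-long name is sent. A hedged Python sketch (the helper name is made up; a real script would also need to match the server's encoding):

```python
NAMEDATALEN = 64  # PostgreSQL compile-time default; identifiers keep 63 bytes

def check_identifier(name, encoding="utf-8"):
    """Raise instead of letting the server silently truncate the name.
    Note the limit is measured in *bytes*, not characters."""
    nbytes = len(name.encode(encoding))
    if nbytes > NAMEDATALEN - 1:
        raise ValueError(
            f"identifier {name!r} is {nbytes} bytes; "
            f"PostgreSQL would truncate it to {NAMEDATALEN - 1}"
        )
    return name

check_identifier("short_role")        # fine, returned unchanged
try:
    check_identifier("long" * 20)     # 80 bytes: would be truncated
except ValueError as e:
    print(e)
```

This would have caught both failure modes above: the GRANT landing on the wrong role, and the DROP of a differently-named table that truncates to the same 63 bytes.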
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
On Fri, Jul 8, 2011 at 2:21 AM, Darren Duncan <dar...@darrenduncan.net> wrote:

I think an even better way to support this would be based on Postgres having support for directly using multiple databases within the same SQL session at once, as if namespaces were another level deep: the first level being the databases, the second level the schemas, and the third level the schema objects. Kind of like how the SQL standard defines its catalog/schema/object namespaces. This instead of needing to use federation or that contrib module to use multiple Pg databases of the same cluster at once.

But if that's what you want, just don't put your data in different databases in the first place. That's what schemas are for. If for some reason we needed to have tables that happened to be called x.y.z and a.b.c accessible from a single SQL session, we could allow that much more simply by allowing schemas to be nested. Then we could allow arbitrary numbers of levels, not just three.

The whole point of having databases and schemas as separate objects is that they do different things: schemas are just containers for names, allowing common access to data, and databases are completely separate entities, allowing privilege separation for (say) a multi-tenant hosting environment. We're not going to throw out the latter concept just so people can use two dots in their table names instead of one.

Under this scenario, we make the property of a database being read-only or read-write for the current SQL session associated with a database rather than the whole SQL session. A given transaction can read from any database but can only make changes to the ones not read-only. Also, the proper way to do temporary tables would be to put them in another database than the main one, where the whole other database has the property of being temporary.
Under this scenario, there would be separate system catalogs for each database, and so the ones for read-only databases are read-only, and the ones for other databases aren't. Then the system catalog itself fundamentally isn't more complicated, per database, and anything extra to handle cross-database queries or whatever, if anything, is a separate layer. Code that only deals with a single database at once would be an optimized situation and perform no worse than it does now.

I think you should make more of an effort to understand how the system works now, and why, before proposing radical redesigns.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
Robert Haas wrote:

But if that's what you want, just don't put your data in different databases in the first place. That's what schemas are for. If for some reason we needed to have tables that happened to be called x.y.z and a.b.c accessible from a single SQL session, we could allow that much more simply by allowing schemas to be nested. Then we could allow arbitrary numbers of levels, not just three. The whole point of having databases and schemas as separate objects is that they do different things: schemas are just containers for names, allowing common access to data, and databases are completely separate entities, allowing privilege separation for (say) a multi-tenant hosting environment. We're not going to throw out the latter concept just so people can use two dots in their table names instead of one.

I agree with what you're saying in general, and that schema namespaces should be nestable to arbitrary levels. One dot or two isn't an issue I have. Dividing based on databases or on schemas is a big and important distinction.

I see the semantic purpose of using multiple databases as allowing things to be completely independent and self-defined, so that one can understand the meaning of any one database in isolation. One can take each of two databases and walk off with them in opposite directions, and each can still be used and understood. Schemas, by contrast, are namespaces for organizing entities within a single database, where any of those entities may be interdependent, such as defining a data type in one schema and using it as the declared type of a routine, table, or constraint in another. But just because you use multiple databases in order for them to be independent, sometimes one still wants to use them together, and an abstraction loosely like federation is useful here.

I think you should make more of an effort to understand how the system works now, and why, before proposing radical redesigns.

Well yes, of course.
But that will take time, and I think I already understand enough about it to make some useful contributions in the meantime. How much or what I already know may not always come across well. If this bothers people, then I can make more of an effort to reduce my input until I have more solid things to back it up. -- Darren Duncan
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
Robert Haas <robertmh...@gmail.com> writes:

If for some reason we needed to have tables that happened to be called x.y.z and a.b.c accessible from a single SQL session, we could allow that much more simply by allowing schemas to be nested. Then we could allow arbitrary numbers of levels, not just three.

FWIW, I actually tried to do that back when we first introduced schema support (the fact that the code calls them "namespaces" and not "schemas" is a leftover from that idea). It turns out to be a whole lot harder than it sounds, because of the ambiguity you get about which name goes at what level.

A simple example of this is: if you write x.y in a query, is that meant to be table x's column y, or is it meant to be field y within a composite column x of some table in the query? We've resolved that by requiring you to write (x).y when you mean the latter, but it's not exactly an intuitive or pleasant answer. In the same way, if namespaces can be nested to different levels, it gets really messy to support abbreviations of any sort --- but the SQL spec requires us to be able to do so.

regards, tom lane
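Tom's x.y ambiguity can be made concrete with a toy resolver: given one table x with a plain column y, and another table with a composite column x that has a field y, a bare x.y matches both readings. A Python sketch (the schema and resolution logic are invented purely for illustration; this is not how the PostgreSQL parser works):

```python
def resolve(dotted, tables):
    """Toy two-way resolution of a dotted name 'a.b':
    reading 1: column b of table a
    reading 2: field b of some table's composite column a."""
    a, b = dotted.split(".")
    readings = []
    for tname, cols in tables.items():
        if tname == a and b in cols and not isinstance(cols[b], dict):
            readings.append(f"column {b} of table {a}")
        if a in cols and isinstance(cols[a], dict) and b in cols[a]:
            readings.append(
                f"field {b} of composite column {a} in table {tname}")
    return readings

# A dict models a composite type; a string models a scalar column type.
tables = {
    "x": {"y": "int"},         # table x with a plain column y
    "t": {"x": {"y": "int"}},  # table t with a composite column x
}
print(resolve("x.y", tables))  # two readings: the bare name is ambiguous
```

With arbitrarily nested namespaces, every extra level multiplies these candidate readings, which is the difficulty with supporting the abbreviations the SQL spec demands.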