Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Darren Duncan

Jeff Davis wrote:

On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:

When you create a temporary table, PostgreSQL needs to add rows in
pg_class, pg_attribute, and probably other system catalogs. So there are
writes, which aren't possible in a read-only transaction. Hence the
error. And no, there is no workaround.
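For reference, a minimal reproduction of the limitation being described (the
exact error wording may vary between PostgreSQL versions):

BEGIN;
SET TRANSACTION READ ONLY;
CREATE TEMPORARY TABLE scratch (id integer);
-- ERROR:  cannot execute CREATE TABLE in a read-only transaction
ROLLBACK;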

That sounds like a deficiency to overcome.

It should be possible for those system catalogs to be virtual, defined like 
union views over similar immutable tables for the read-only database plus 
mutable in-memory ones for the temporary tables.


Ideally, yes, from a logical standpoint there are catalog entries that
are only interesting to one backend.

But that doesn't mean it's easy to do. Remember that catalog lookups
(even though most go through a cache) are a path that is important to
performance. Also, more complex catalog interpretations may introduce
some extra bootstrapping challenges.


Are there any plans in the works to do this?


I don't think so. It sounds like some fairly major work for a
comparatively minor benefit.

Suggestions welcome, of course, to either make the work look more minor
or the benefits look more major ;)


What I said before was a simplification; below I present my real proposal.

I think an even better way to support this would be based on Postgres having 
support for directly using multiple databases within the same SQL session at 
once, as if namespaces were another level deep, the first level being the 
databases, the second level the schemas, and the third level the schema objects.


Kind of like how the SQL standard defines its catalog/schema/object namespaces.

This instead of needing to use federation or that contrib module to use multiple 
Pg databases of the same cluster at once.


Under this scenario, the property of being read-only or read-write for the current 
SQL session would be associated with each database rather than with the whole SQL 
session.  A given transaction can read from any database but can only make changes 
to the ones that are not read-only.


Also, the proper way to do temporary tables would be to put them in a database 
other than the main one, where that whole other database has the property of 
being temporary.


Under this scenario, there would be separate system catalogs for each database, 
and so the ones for read-only databases are read-only, and the ones for other 
databases aren't.


Then the system catalog itself fundamentally isn't more complicated, per 
database, and anything extra to handle cross-database queries or whatever, if 
anything, is a separate layer.  Code that only deals with a single database at 
once would be an optimized situation and perform no worse than it does now.


Furthermore, federating databases is done with the same interface, by adding 
remote/foreign databases as extra databases at the top level namespace.


Fundamentally, a SQL session would be associated with a Pg server, not a 
database managed by such.  When one starts a SQL session, there are initially no 
databases visible to them, and the top-level namespace is empty.


They then mount a database, similarly to how one mounts an OS filesystem, by 
providing appropriate connection info: either just the database name, or also 
user/pass, or also a remote host, etc., as applicable.  These details are the 
difference between using a local/same-Pg-cluster db and a remote/federated one, 
and they also say whether the database is temporary or initially read-only, etc.


See also how SQLite works; this mounting is analogous to their ATTACH.
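For illustration, roughly what that analogy looks like on the SQLite side (the
file and table names below are made up):

ATTACH DATABASE 'archive.db' AS archive;   -- mount a second database file
SELECT o.order_id, c.name
FROM archive.orders AS o
JOIN main.customers AS c ON c.customer_id = o.customer_id;
DETACH DATABASE archive;                   -- unmount it again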

Such a paradigm is also how my Muldis D language interfaces databases; this is 
the most flexible, portable, extensible, optimizable, and elegant approach I can 
think of.


-- Darren Duncan

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Insufficient privileges.

2011-07-08 Thread Albe Laurenz
Dave Coventry wrote:
 I am getting the following error message in my Drupal install.

 PDOException: SQLSTATE[42501]: Insufficient privilege: 7 ERROR:
 permission denied for sequence currenttest_id_seq: INSERT INTO
 currentTest (score) VALUES (:db_insert_placeholder_0);

 This is a table that I created using the postgres super user.

 I have tried to grant the drupal user (drupaluser) privileges
 to the table with:

 GRANT ALL ON currentTest to drupaluser;

 but this fails to resolve the issue.

 Can anyone suggest a way forward?

GRANT USAGE on SEQUENCE currenttest_id_seq TO drupaluser;
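That is, on top of the table grant you already issued, something along these
lines (object names taken from the error message above; adjust as needed):

GRANT SELECT, INSERT, UPDATE, DELETE ON currenttest TO drupaluser;
GRANT USAGE, SELECT ON SEQUENCE currenttest_id_seq TO drupaluser;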

I think that it is a good idea to have different users for
table creation and usage (if possible).
But I would not use a superuser account.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow

2011-07-08 Thread David Hartveld

Op 07-07-11 18:23, Tom Lane schreef:

Karsten Hilbert <karsten.hilb...@gmx.net>  writes:

On Thu, Jul 07, 2011 at 11:14:05AM -0400, Tom Lane wrote:

I'm betting Debian hasn't fixed that bug yet either and so you need this
post-beta2 patch:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c2ba0121c73b7461331104a46d140156e847572a



Do they know about this gcc bug ?


Can't say about Debian in particular, but upstream gcc certainly knows
about it.
https://bugzilla.redhat.com/show_bug.cgi?id=712480
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49390


I'm guessing this is probably the cause of my problem, then. I'll try to 
rebuild with an earlier version of GCC, and let you know the results.


Thanks to all of you!
David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] DELETE taking too much memory

2011-07-08 Thread Dean Rasheed
 On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
 Hi,

 I have a delete query taking 7.2G of ram (and counting) but I do not
 understand why so much memory is necessary. The server has 12G, and
 I'm afraid it'll go into swap. Using postgres 8.3.14.

 I'm purging some old data from table t1, which should cascade-delete
 referencing rows in t2. Here's an anonymized rundown :

 # explain delete from t1 where t1id in (select t1id from t2 where
 foo=0 and bar < '20101101');

It looks as though you're hitting one of the known issues with
PostgreSQL and FKs. The FK constraint checks and CASCADE actions are
implemented using AFTER triggers, which are queued up during the query
to be executed at the end. For very large queries, this queue of
pending triggers can become very large, using up all available memory.

There's a TODO item to try to fix this for a future version of
PostgreSQL (maybe I'll have another go at it for 9.2), but at the
moment all versions of PostgreSQL suffer from this problem.

The simplest work-around for you might be to break your deletes up
into smaller chunks, say 100k or 1M rows at a time, eg:

delete from t1 where t1id in (select t1id from t2 where foo=0 and bar
 < '20101101' limit 100000);
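Spelled out, the chunked approach is just to re-run the delete, each batch in
its own transaction, until it reports that no rows were deleted (the LIMIT is
simply the batch size; column names as above):

BEGIN;
DELETE FROM t1
WHERE t1id IN (SELECT t1id FROM t2
               WHERE foo = 0 AND bar < '20101101'
               LIMIT 100000);
COMMIT;
-- repeat until the DELETE reports 0 rows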

Regards,
Dean

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] DELETE taking too much memory

2011-07-08 Thread Vincent de Phily
On Thursday 07 July 2011 22:26:45 Guillaume Lelarge wrote:
 On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
  Hi,
  
  I have a delete query taking 7.2G of ram (and counting) but I do not
  understand why so much memory is necessary. The server has 12G, and
  I'm afraid it'll go into swap. Using postgres 8.3.14.
  
  I'm purging some old data from table t1, which should cascade-delete
  referencing rows in t2. Here's an anonymized rundown :
  
  
  # \d t1
                                   Table "public.t1"
   Column |  Type   |                        Modifiers
  --------+---------+--------------------------------------------------
   t1id   | integer | not null default nextval('t1_t1id_seq'::regclass)
  (...snip...)
  Indexes:
      "message_pkey" PRIMARY KEY, btree (id)
  (...snip...)
  
  # \d t2
                                   Table "public.t2"
   Column |            Type             |                     Modifiers
  --------+-----------------------------+----------------------------------------------------
   t2id   | integer                     | not null default nextval('t2_t2id_seq'::regclass)
   t1id   | integer                     | not null
   foo    | integer                     | not null
   bar    | timestamp without time zone | not null default now()
  Indexes:
      "t2_pkey" PRIMARY KEY, btree (t2id)
      "t2_bar_key" btree (bar)
      "t2_t1id_key" btree (t1id)
  Foreign-key constraints:
      "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE RESTRICT ON DELETE CASCADE
  
  # explain delete from t1 where t1id in (select t1id from t2 where
  foo=0 and bar < '20101101');
                                           QUERY PLAN
  ---------------------------------------------------------------------------------------------
   Nested Loop  (cost=5088742.39..6705282.32 rows=30849 width=6)
     ->  HashAggregate  (cost=5088742.39..5089050.88 rows=30849 width=4)
           ->  Index Scan using t2_bar_key on t2  (cost=0.00..5035501.50 rows=21296354 width=4)
                 Index Cond: (bar < '2010-11-01 00:00:00'::timestamp without time zone)
                 Filter: (foo = 0)
     ->  Index Scan using t1_pkey on t1  (cost=0.00..52.38 rows=1 width=10)
           Index Cond: (t1.t1id = t2.t1id)
  (7 rows)
  
  
  Note that the estimate of 30849 rows is way off : there should be
  around 55M rows deleted from t1, and 2-3 times as much from t2.
  
  When looking at the plan, I can easily imagine that data gets
  accumulated below the nestedloop (thus using all that memory), but why
  isn't each entry freed once one row has been deleted from t1 ? That
  entry isn't going to be found again in t1 or in t2, so why keep it
  around ?
  
  Is there a better way to write this query ? Would postgres 8.4/9.0
  handle things better ?
 
 Do you have any DELETE triggers in t1 and/or t2?

No, there are triggers on insert/update to t1 which both insert into t2, but 
no delete trigger. Deletions do cascade from t1 to t2 because of the foreign 
key.
-- 
Vincent de Phily
Mobile Devices
+33 (0) 142 119 325
+353 (0) 85 710 6320 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [PERFORM] DELETE taking too much memory

2011-07-08 Thread Vincent de Phily
On Thursday 07 July 2011 19:54:08 French, Martin wrote:
 How up to date are the statistics for the tables in question?
 
 What value do you have for effective cache size?
 
 My guess would be that planner thinks the method it is using is right
 either for its current row number estimations, or the amount of memory
 it thinks it has to play with.

Not very up to date I'm afraid (as shown by the low estimate of deleted rows). 
Table t2 has been insert-only since its re-creation (that's another story), 
while t1 is your classic insert-many, update-recent.

We haven't tweaked effective cache size yet, it's on the TODO... like many 
other things :/
-- 
Vincent de Phily

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] DELETE taking too much memory

2011-07-08 Thread Vincent de Phily
On Friday 08 July 2011 10:05:47 Dean Rasheed wrote:
  On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
  Hi,
  
  I have a delete query taking 7.2G of ram (and counting) but I do not
  understand why so much memory is necessary. The server has 12G, and
  I'm afraid it'll go into swap. Using postgres 8.3.14.
  
  I'm purging some old data from table t1, which should cascade-delete
  referencing rows in t2. Here's an anonymized rundown :
  
  # explain delete from t1 where t1id in (select t1id from t2 where
  foo=0 and bar < '20101101');
 
 It looks as though you're hitting one of the known issues with
 PostgreSQL and FKs. The FK constraint checks and CASCADE actions are
 implemented using AFTER triggers, which are queued up during the query
 to be executed at the end. For very large queries, this queue of
 pending triggers can become very large, using up all available memory.
 
 There's a TODO item to try to fix this for a future version of
 PostgreSQL (maybe I'll have another go at it for 9.2), but at the
 moment all versions of PostgreSQL suffer from this problem.

That's very interesting, and a more plausible not-optimized-yet item than my 
guesses so far, thanks. Drop me a mail if you work on this, and I'll find some 
time to test your code.

I'm wondering though : this sounds like the behaviour of a deferrable fkey, 
which AFAICS is not the default and not my case ? I haven't explored that area 
of constraints yet, so there's certainly some detail that I'm missing.


 The simplest work-around for you might be to break your deletes up
 into smaller chunks, say 100k or 1M rows at a time, eg:
 
 delete from t1 where t1id in (select t1id from t2 where foo=0 and bar
  < '20101101' limit 100000);

Yes, that's what we ended up doing. We canceled the query after 24h, shortly 
before the OOM killer would have, and started doing things in smaller batches.


-- 
Vincent de Phily

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [PERFORM] DELETE taking too much memory

2011-07-08 Thread Vincent de Phily
On Friday 08 July 2011 10:31:33 French, Martin wrote:
 If the query planner thinks it has the default amount of memory (128MB)
 and the stats are out of date, then it will by no means be able to plan
 proper execution.
 
 I would recommend setting the effective_cache_size to an appropriate
 value, running analyze on both tables with an appropriate stats
 target, and then explaining the query again to see if it's more
 accurate.

Yes, I'll schedule those two to run during the night and repost an explain, 
for information. However, we worked around the initial problem by running the 
delete in smaller batches.

Thanks.
-- 
Vincent de Phily


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow

2011-07-08 Thread David Hartveld

Op 08-07-11 10:19, David Hartveld schreef:

Op 07-07-11 18:23, Tom Lane schreef:

Karsten Hilbert <karsten.hilb...@gmx.net> writes:

On Thu, Jul 07, 2011 at 11:14:05AM -0400, Tom Lane wrote:

I'm betting Debian hasn't fixed that bug yet either and so you need
this
post-beta2 patch:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c2ba0121c73b7461331104a46d140156e847572a




Do they know about this gcc bug ?


Can't say about Debian in particular, but upstream gcc certainly knows
about it.
https://bugzilla.redhat.com/show_bug.cgi?id=712480
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49390


I'm guessing this is probably the cause of my problem, then. I'll try to
rebuild with an earlier version of GCC, and let you know the results.


I've rebuilt with debian gcc 4.4.5-8 and am running 9.1 now without any 
of the above problems. I've reported a bug for debian postgresql-9.1. 
I've also seen that the current gcc-4.6 version for debian is 4.6.1, so 
that shouldn't create any problems anymore (or at least the fix for the gcc 
bug is marked as applied in 4.6.1 in the gcc bugtracker).


Thanks for all your time, guys!
Greetings, David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] DELETE taking too much memory

2011-07-08 Thread Dean Rasheed
On 8 July 2011 10:44, Vincent de Phily
vincent.deph...@mobile-devices.fr wrote:
 On Friday 08 July 2011 10:05:47 Dean Rasheed wrote:
  On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
  Hi,
 
  I have a delete query taking 7.2G of ram (and counting) but I do not
   understand why so much memory is necessary. The server has 12G, and
  I'm afraid it'll go into swap. Using postgres 8.3.14.
 
  I'm purging some old data from table t1, which should cascade-delete
  referencing rows in t2. Here's an anonymized rundown :
 
  # explain delete from t1 where t1id in (select t1id from t2 where
   foo=0 and bar < '20101101');

 It looks as though you're hitting one of the known issues with
 PostgreSQL and FKs. The FK constraint checks and CASCADE actions are
 implemented using AFTER triggers, which are queued up during the query
 to be executed at the end. For very large queries, this queue of
 pending triggers can become very large, using up all available memory.

 There's a TODO item to try to fix this for a future version of
 PostgreSQL (maybe I'll have another go at it for 9.2), but at the
 moment all versions of PostgreSQL suffer from this problem.

 That's very interesting, and a more plausible not-optimized-yet item than my
 guesses so far, thanks. Drop me a mail if you work on this, and I'll find some
 time to test your code.

 I'm wondering though : this sounds like the behaviour of a deferrable fkey,
 which AFAICS is not the default and not my case ? I haven't explored that area
 of constraints yet, so there's certainly some detail that I'm missing.


Yes, it's the same issue that affects deferrable PK and FK
constraints, but even non-deferrable FKs use AFTER ROW triggers that
suffer from this problem. These triggers don't show up in a \d from
psql, but they are there (try select * from pg_trigger where
tgconstrrelid = 't1'::regclass) and because they fire AFTER rather
than BEFORE, queuing up large numbers of them is a problem.
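For example, a slightly expanded version of that catalog query (the column
aliases are arbitrary):

SELECT tgname,
       tgrelid::regclass       AS trigger_table,
       tgconstrrelid::regclass AS other_table
FROM pg_trigger
WHERE tgconstrrelid = 't1'::regclass;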

Regards,
Dean



 The simplest work-around for you might be to break your deletes up
 into smaller chunks, say 100k or 1M rows at a time, eg:

 delete from t1 where t1id in (select t1id from t2 where foo=0 and bar
  < '20101101' limit 100000);

 Yes, that's what we ended up doing. We canceled the query after 24h, shortly
 before the OOM killer would have, and started doing things in smaller batches.


 --
 Vincent de Phily


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [PERFORM] [GENERAL] DELETE taking too much memory

2011-07-08 Thread Claudio Freire
On Fri, Jul 8, 2011 at 12:48 PM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Yes, it's the same issue that affects deferrable PK and FK
 constraints, but even non-deferrable FKs use AFTER ROW triggers that
 suffer from this problem. These triggers don't show up in a \d from
 psql, but they are there (try select * from pg_trigger where
 tgconstrrelid = 't1'::regclass) and because they fire AFTER rather
 than BEFORE, queuing up large numbers of them is a problem.

I would imagine an easy solution would be to compress the queue by
inserting a single element representing all rows of row version id X.

I.e.: a delete or update will need to check all the row versions it
creates with its txid, so this txid could be used to represent the rows
that need checking afterwards, right?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [PERFORM] [GENERAL] DELETE taking too much memory

2011-07-08 Thread Jose Ildefonso Camargo Tolosa
On Fri, Jul 8, 2011 at 4:35 AM, Dean Rasheed <dean.a.rash...@gmail.com> wrote:

  On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
  Hi,
 
  I have a delete query taking 7.2G of ram (and counting) but I do not
   understand why so much memory is necessary. The server has 12G, and
  I'm afraid it'll go into swap. Using postgres 8.3.14.
 
  I'm purging some old data from table t1, which should cascade-delete
  referencing rows in t2. Here's an anonymized rundown :
 
  # explain delete from t1 where t1id in (select t1id from t2 where
   foo=0 and bar < '20101101');

 It looks as though you're hitting one of the known issues with
 PostgreSQL and FKs. The FK constraint checks and CASCADE actions are
 implemented using AFTER triggers, which are queued up during the query
 to be executed at the end. For very large queries, this queue of
 pending triggers can become very large, using up all available memory.

 There's a TODO item to try to fix this for a future version of
 PostgreSQL (maybe I'll have another go at it for 9.2), but at the
 moment all versions of PostgreSQL suffer from this problem.

 The simplest work-around for you might be to break your deletes up
 into smaller chunks, say 100k or 1M rows at a time, eg:

 delete from t1 where t1id in (select t1id from t2 where foo=0 and bar
  < '20101101' limit 100000);


I'd like to comment here: I had serious performance issues with a similar
query (the planner did horrible things). Not sure if the planner will do the
same dumb thing it did for me; my query was against the same table (i.e., t1=t2).
I had this query:

delete from t1 where ctid in (select ctid from t1 where
created_at < '20101231' limit 1);   --- this was slow.  Changed to:

delete from t1 where ctid = any(array(select ctid from t1 where
created_at < '20101231' limit 1));   --- a lot faster.

So... will the same principle work here?, doing this?:

delete from t1 where t1id = any(array(select t1id from t2 where foo=0 and
bar < '20101101' limit 100000));  -- would this query be faster than the
original one?




 Regards,
 Dean

 --
 Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Jeff Davis
On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:
 I think an even better way to support this would be based on Postgres having 
 support for directly using multiple databases within the same SQL session at 
 once, as if namespaces were another level deep, the first level being the 
 databases, the second level the schemas, and the third level the schema 
 objects.
 
 Kind of like what the SQL standard defines its catalog/schema/object 
 namespaces.
 
 This instead of needing to use federating or that contrib module to use 
 multiple 
 Pg databases of the same cluster at once.
 
 Under this scenario, we make the property of a database being read-only or 
 read-write for the current SQL session associated with a database rather than 
 the whole SQL session.  A given transaction can read from any database but 
 can 
 only make changes to the ones not read-only.
 
 Also, the proper way to do temporary tables would be to put them in another 
 database than the main one, where the whole other database has the property 
 of 
 being temporary.
 
 Under this scenario, there would be separate system catalogs for each 
 database, 
 and so the ones for read-only databases are read-only, and the ones for other 
 databases aren't.
 
 Then the system catalog itself fundamentally isn't more complicated, per 
 database, and anything extra to handle cross-database queries or whatever, if 
 anything, is a separate layer.  Code that only deals with a single database 
 at 
 once would be an optimized situation and perform no worse than it does now.

One challenge that jumps to mind here is that an Oid would need to
become a pair (catalog, oid). Even if the end result isn't much more
complex, getting there is not trivial.

 See also how SQLite works; this mount being analogous to their attach.

I'm not sure SQLite is the best example. It has a radically different
architecture.

Regards,
Jeff Davis


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [PERFORM] DELETE taking too much memory

2011-07-08 Thread French, Martin
If the query planner thinks it has the default amount of memory (128MB)
and the stats are out of date, then it will by no means be able to plan
proper execution.

I would recommend setting the effective_cache_size to an appropriate
value, running analyze on both tables with an appropriate stats
target, and then explaining the query again to see if it's more
accurate.
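As a rough sketch of those steps (the numbers below are placeholders, not
recommendations; table and column names from the thread):

SET effective_cache_size = '8GB';   -- or set it cluster-wide in postgresql.conf

ALTER TABLE t2 ALTER COLUMN bar SET STATISTICS 500;
ALTER TABLE t2 ALTER COLUMN foo SET STATISTICS 500;
ANALYZE t1;
ANALYZE t2;

EXPLAIN DELETE FROM t1
WHERE t1id IN (SELECT t1id FROM t2 WHERE foo = 0 AND bar < '20101101');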

Cheers

-Original Message-
From: Vincent de Phily [mailto:vincent.deph...@mobile-devices.fr] 
Sent: 08 July 2011 10:20
To: French, Martin
Cc: pgsql-general@postgresql.org; pgsql-performa...@postgresql.org
Subject: Re: [PERFORM] DELETE taking too much memory

On Thursday 07 July 2011 19:54:08 French, Martin wrote:
 How up to date are the statistics for the tables in question?
 
 What value do you have for effective cache size?
 
 My guess would be that planner thinks the method it is using is right
 either for its current row number estimations, or the amount of memory
 it thinks it has to play with.

Not very up to date I'm afraid (as shown by the low estimate of deleted
rows). 
Table t2 has been insert-only since its re-creation (that's another
story), 
while t1 is your classic insert-many, update-recent.

We haven't tweaked effective cache size yet, it's on the TODO... like
many 
other things :/
-- 
Vincent de Phily



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread mike beeper


I like Darren's proposal.  It is elegant.

 Date: Fri, 8 Jul 2011 18:38:59 +1200
 From: gavinflo...@archidevsys.co.nz
 To: dar...@darrenduncan.net
 CC: pg...@j-davis.com; guilla...@lelarge.info; mbee...@hotmail.com; 
 pgsql-general@postgresql.org; pgsql-hack...@postgresql.org
 Subject: Re: [HACKERS] [GENERAL] Creating temp tables inside read only 
 transactions
 
 On 08/07/11 18:21, Darren Duncan wrote:
  Jeff Davis wrote:
  On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
  When you create a temporary table, PostgreSQL needs to add rows in
  pg_class, pg_attribute, and probably other system catalogs. So 
  there are
  writes, which aren't possible in a read-only transaction. Hence the
  error. And no, there is no workaround.
  That sounds like a deficiency to overcome.
 
  It should be possible for those system catalogs to be virtual, 
  defined like union views over similar immutable tables for the 
  read-only database plus mutable in-memory ones for the temporary 
  tables.
 
  Ideally, yes, from a logical standpoint there are catalog entries that
  are only interesting to one backend.
 
  But that doesn't mean it's easy to do. Remember that catalog lookups
  (even though most go through a cache) are a path that is important to
  performance. Also, more complex catalog interpretations may introduce
  some extra bootstrapping challenges.
 
  Are there any plans in the works to do this?
 
  I don't think so. It sounds like some fairly major work for a
  comparatively minor benefit.
 
  Suggestions welcome, of course, to either make the work look more minor
  or the benefits look more major ;)
 
  What I said before was a simplification; below I present my real 
  proposal.
 
  I think an even better way to support this would be based on 
  Postgres having support for directly using multiple databases within 
  the same SQL session at once, as if namespaces were another level 
  deep, the first level being the databases, the second level the 
  schemas, and the third level the schema objects.
 
  Kind of like what the SQL standard defines its catalog/schema/object 
  namespaces.
 
  This instead of needing to use federating or that contrib module to 
  use multiple Pg databases of the same cluster at once.
 
  Under this scenario, we make the property of a database being 
  read-only or read-write for the current SQL session associated with a 
  database rather than the whole SQL session.  A given transaction can 
  read from any database but can only make changes to the ones not 
  read-only.
 
  Also, the proper way to do temporary tables would be to put them in 
  another database than the main one, where the whole other database has 
  the property of being temporary.
 
  Under this scenario, there would be separate system catalogs for each 
  database, and so the ones for read-only databases are read-only, and 
  the ones for other databases aren't.
 
  Then the system catalog itself fundamentally isn't more complicated, 
  per database, and anything extra to handle cross-database queries or 
  whatever, if anything, is a separate layer.  Code that only deals with 
  a single database at once would be an optimized situation and perform 
  no worse than it does now.
 
  Furthermore, federating databases is done with the same interface, by 
  adding remote/foreign databases as extra databases at the top level 
  namespace.
 
  Fundamentally, a SQL session would be associated with a Pg server, not 
  a database managed by such.  When one starts a SQL session, there are 
  initially no databases visible to them, and the top-level namespace is 
  empty.
 
  They then mount a database, similarly to how one mounts an OS 
  filesystem, by providing appropriate connection info, either just the 
  database name or also user/pass or also remote host etc as is 
  applicable, these details being the difference between using a 
  local/same-Pg-cluster db or a remote/federated one, and the details 
  also say whether it is temporary or initially read-only etc.
 
  See also how SQLite works; this mount being analogous to their 
  attach.
 
  Such a paradigm is also how my Muldis D language interfaces databases; 
  this is the most flexible, portable, extensible, optimizable, and 
  elegant approach I can think of.
 
  -- Darren Duncan
 
 I would suggest that the default action for psql would be as now: 
 associate the session with a database named after the current O/S user.
 
 However, use a new psql flag, such as '-unattached' or '-N', to indicate 
 that no database is to be attached when psql starts up.
 
 While I don't have a current need for what you propose, it does look 
 interesting and potentially useful to me.
 
  

Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Gavin Flower

On 08/07/11 18:21, Darren Duncan wrote:

Jeff Davis wrote:

On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:

When you create a temporary table, PostgreSQL needs to add rows in
pg_class, pg_attribute, and probably other system catalogs. So 
there are

writes, which aren't possible in a read-only transaction. Hence the
error. And no, there is no workaround.

That sounds like a deficiency to overcome.

It should be possible for those system catalogs to be virtual, 
defined like union views over similar immutable tables for the 
read-only database plus mutable in-memory ones for the temporary 
tables.


Ideally, yes, from a logical standpoint there are catalog entries that
are only interesting to one backend.

But that doesn't mean it's easy to do. Remember that catalog lookups
(even though most go through a cache) are a path that is important to
performance. Also, more complex catalog interpretations may introduce
some extra bootstrapping challenges.


Are there any plans in the works to do this?


I don't think so. It sounds like some fairly major work for a
comparatively minor benefit.

Suggestions welcome, of course, to either make the work look more minor
or the benefits look more major ;)


What I said before was a simplification; below I present my real 
proposal.


I think an even better way to support this would be based on 
Postgres having support for directly using multiple databases within 
the same SQL session at once, as if namespaces were another level 
deep, the first level being the databases, the second level the 
schemas, and the third level the schema objects.


Kind of like what the SQL standard defines its catalog/schema/object 
namespaces.


This instead of needing to use federating or that contrib module to 
use multiple Pg databases of the same cluster at once.


Under this scenario, we make the property of a database being 
read-only or read-write for the current SQL session associated with a 
database rather than the whole SQL session.  A given transaction can 
read from any database but can only make changes to the ones not 
read-only.


Also, the proper way to do temporary tables would be to put them in 
another database than the main one, where the whole other database has 
the property of being temporary.


Under this scenario, there would be separate system catalogs for each 
database, and so the ones for read-only databases are read-only, and 
the ones for other databases aren't.


Then the system catalog itself fundamentally isn't more complicated, 
per database, and anything extra to handle cross-database queries or 
whatever, if anything, is a separate layer.  Code that only deals with 
a single database at once would be an optimized situation and perform 
no worse than it does now.


Furthermore, federating databases is done with the same interface, by 
adding remote/foreign databases as extra databases at the top level 
namespace.


Fundamentally, a SQL session would be associated with a Pg server, not 
a database managed by such.  When one starts a SQL session, there are 
initially no databases visible to them, and the top-level namespace is 
empty.


They then mount a database, similarly to how one mounts an OS 
filesystem, by providing appropriate connection info, either just the 
database name or also user/pass or also remote host etc as is 
applicable, these details being the difference between using a 
local/same-Pg-cluster db or a remote/federated one, and the details 
also say whether it is temporary or initially read-only etc.


See also how SQLite works; this mount being analogous to their 
attach.


Such a paradigm is also how my Muldis D language interfaces databases; 
this is the most flexible, portable, extensible, optimizable, and 
elegant approach I can think of.


-- Darren Duncan

I would suggest that the default action for psql would be as now: 
associate the session with a database named after the current O/S user.


However, use a new psql flag, such as '-unattached' or '-N', to indicate 
that no database is to be attached when psql starts up.


While I don't have a current need for what you propose, it does look 
interesting and potentially useful to me.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] 'libpq.lib' linking problem with VC++

2011-07-08 Thread Roy's Email

My apologies if this is not the appropriate list for this, but the novice 
list has provided no response for weeks.
 
I'm using Visual Studios C++ Express 2008 on Windows 7 with the standard 
(pre-packaged) 64-bit Windows distribution of PostgreSQL v.9.0.4 and I am 
having unresolved refs when linking to libpq.lib.  The relevant elements of my 
program are:
 
#include "stdafx.h"
#include "libpq-fe.h"
using namespace System;
 
int main(array<System::String ^> ^args)
{
  ...
  char *pghost = "localhost",
       *pgport = "5432",
       *pgoptions = NULL,
       *pgtty = NULL;
  char *dbName = "mydb";
  PGconn *conn;
  /* make a connection to the database */
  conn = PQsetdbLogin(pghost, pgport, pgoptions, pgtty, dbName, NULL, NULL);
  ...
}
 
Linking against libpq.lib in the postgres lib directory, the error I get is:
 
DBTest.obj : error LNK2001: unresolved external symbol extern "C" struct 
pg_conn * __cdecl PQsetdbLogin( ...
 
In the project properties I have included the path to the postgres lib 
directory and listed libpq.lib as a linker input.

So, what am I missing?  How do I get this thing to link without error?

Regards, 

- Roy 

=*=*= 
Always do right. 
This will gratify some people and astonish the rest. - Mark Twain 

The truth is rarely pure, and never simple. - Oscar Wilde
 
  

Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Darren Duncan

Jeff Davis wrote:

On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:
I think an even better way to support this would be based on Postgres having 
support for directly using multiple databases within the same SQL session at 
once, as if namespaces were another level deep, the first level being the 
databases, the second level the schemas, and the third level the schema objects.

snip
Then the system catalog itself fundamentally isn't more complicated, per 
database, and anything extra to handle cross-database queries or whatever, if 
anything, is a separate layer.  Code that only deals with a single database at 
once would be an optimized situation and perform no worse than it does now.


One challenge that jumps to mind here is that an Oid would need to
become a pair (catalog, oid). Even if the end result isn't much more
complex, getting there is not trivial.


Yes, but that would just be in-memory or in temporary places external to every 
database.  On disk internal to a database there would just be the oid.  In fact, 
another aspect of the database model I defined is that each database is 
entirely self-contained; while you can do cross-database queries, you don't have 
cross-database constraints, in the general case.



See also how SQLite works; this mount being analogous to their attach.


I'm not sure SQLite is the best example. It has a radically different
architecture.


Still, it's an example I know of where you can access several clearly separable 
databases at once through a common namespace.  While one might argue this is a 
substitute for multiple schema support, I don't because with multiple schemas 
you can have integrity constraints that cross schemas.  The namespaces issue is 
largely orthogonal to self-containment or integrity in my model.


But look at Oracle too, at least as I understand it.

Oracle supports CONNECT TO ... AUTHORIZE .../etc SQL, meaning you can define 
what databases you are accessing within the SQL session, rather than having to 
do it externally.  I assume that Oracle's features correspond somewhat to my 
proposal, and so enable cross-database queries in the illusion that several 
databases are one.


Suffice it to say, I have thought through my proposed model for years, with one 
of its (and Muldis D's) express purposes being to provide a common normalized 
paradigm that all the existing SQL DBMSs can map to with consistent behavior 
whether Oracle or SQLite, and I haven't stated all of it here (a lot more is in 
my published language spec).  Key mapping points are the boundaries of a 
database's self-definability.  And namespace nesting is actually 
arbitrary-depth, so accounting for everything from no native schema support to 
schema plus package namespace support.


-- Darren Duncan

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] long names get truncated

2011-07-08 Thread A . M .
Hello,

I was bitten by a length-truncated role name used in a script since the 
truncation only raises a NOTICE. The symptom was that some GRANTs ended up 
on the wrong objects after name truncation.

Then, I experimented with tables with long names and was surprised by the 
truncation behavior:

test=# create table 
longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong(a int);
NOTICE:  identifier 
"longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong" will be 
truncated to "longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglon"
CREATE TABLE
test=# \d longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong
Did not find any relation named 
"longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong".
test=# drop table 
longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglongNOT; 
--SURPRISE!
NOTICE:  identifier 
"longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglongnot" will 
be truncated to 
"longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglon"
DROP TABLE

One really has to pay attention to the length limits (63 bytes):

1) Name truncation is not an error.
2) psql \d doesn't work with long names - perhaps the same auto-truncation rules 
should apply?
3) DROPping a non-existent table with a truncated identifier unintentionally 
drops the long name table.

For those curious, I hit the limits prefixing roles with UUIDs for automated 
testing so that database-global objects can be deleted after the test. 

I wish there were a way to turn the truncation into an error. Is there some 
better way I could have caught this?
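For example, a generated name could be checked up front, or the exact
untruncated name verified after creation (a sketch; the role name below is
made up, and 63 is NAMEDATALEN-1 on a stock build):

SELECT octet_length('3f2504e0-4f89-11d3-9a0c-0305e82c3301_tester') <= 63 AS fits;

-- or, after CREATE ROLE, confirm the exact long name really exists:
SELECT 1 FROM pg_roles
WHERE rolname = '3f2504e0-4f89-11d3-9a0c-0305e82c3301_tester';
-- zero rows back means the name was silently truncated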

Cheers,
M
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Robert Haas
On Fri, Jul 8, 2011 at 2:21 AM, Darren Duncan dar...@darrenduncan.net wrote:
 I think an even better way to support this would be based on Postgres
 having support for directly using multiple databases within the same SQL
 session at once, as if namespaces were another level deep, the first level
 being the databases, the second level the schemas, and the third level the
 schema objects.

 Kind of like what the SQL standard defines its catalog/schema/object
 namespaces.

 This instead of needing to use federating or that contrib module to use
 multiple Pg databases of the same cluster at once.

But if that's what you want, just don't put your data in different
databases in the first place.  That's what schemas are for.

If for some reason we needed to have tables that happened to be called
x.y.z and a.b.c accessible from a single SQL session, we could allow
that much more simply by allowing schemas to be nested.  Then we could
allow arbitrary numbers of levels, not just three.  The whole point of
having databases and schemas as separate objects is that they do
different things: schemas are just containers for names, allowing
common access to data, and databases are completely separate entities,
allowing privilege separation for (say) a multi-tenant hosting
environment.  We're not going to throw out the latter concept just so
people can use two dots in their table names instead of one.

 Under this scenario, we make the property of a database being read-only or
 read-write for the current SQL session associated with a database rather
 than the whole SQL session.  A given transaction can read from any database
 but can only make changes to the ones not read-only.

 Also, the proper way to do temporary tables would be to put them in another
 database than the main one, where the whole other database has the property
 of being temporary.

 Under this scenario, there would be separate system catalogs for each
 database, and so the ones for read-only databases are read-only, and the
 ones for other databases aren't.

 Then the system catalog itself fundamentally isn't more complicated, per
 database, and anything extra to handle cross-database queries or whatever,
 if anything, is a separate layer.  Code that only deals with a single
 database at once would be an optimized situation and perform no worse than
 it does now.

I think you should make more of an effort to understand how the system
works now, and why, before proposing radical redesigns.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Darren Duncan

Robert Haas wrote:

But if that's what you want, just don't put your data in different
databases in the first place.  That's what schemas are for.

If for some reason we needed to have tables that happened to be called
x.y.z and a.b.c accessible from a single SQL session, we could allow
that much more simply by allowing schemas to be nested.  Then we could
allow arbitrary numbers of levels, not just three.  The whole point of
having databases and schemas as separate objects is that they do
different things: schemas are just containers for names, allowing
common access to data, and databases are completely separate entities,
allowing privilege separation for (say) a multi-tenant hosting
environment.  We're not going to throw out the latter concept just so
people can use two dots in their table names instead of one.


I agree with what you're saying in general and that schema namespaces should be 
nestable to arbitrary levels.  One dot or two isn't an issue I have.


Dividing based on databases or on schemas is a big and important distinction.

I see that the semantic purpose of using multiple databases is to allow things 
to be completely independent and self-defined, where one can understand the 
meaning of any one database in isolation.  So one can take each of the 2 
databases and walk off with them in opposite directions, and each can still be 
used and understood.


Whereas, schemas are namespaces for organizing entities within a single database 
where any of those entities may be interdependent, such as defining a data type 
in one schema and using it as the declared type with a routine or table or 
constraint in another.


But just because you use multiple databases in order for them to be independent, 
sometimes one still wants to use them together, and an abstraction loosely like 
federation is useful here.



I think you should make more of an effort to understand how the system
works now, and why, before proposing radical redesigns.


Well yes, of course.  But that will take time and I think I already understand 
enough about it to make some useful contributions in the meantime.  How much or 
what I already know may not always come across well.  If this bothers people 
then I can make more of an effort to reduce my input until I have more solid 
things to back them up.


-- Darren Duncan

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If for some reason we needed to have tables that happened to be called
 x.y.z and a.b.c accessible from a single SQL session, we could allow
 that much more simply by allowing schemas to be nested.  Then we could
 allow arbitrary numbers of levels, not just three.

FWIW, I actually tried to do that back when we first introduced schema
support (the fact that the code calls them namespaces and not schemas
is a leftover from that idea).  It turns out to be a whole lot harder
than it sounds, because of the ambiguity you get about which name goes
at what level.  A simple example of this is: if you write x.y in a
query, is that meant to be table x's column y, or is it meant to be
field y within a composite column x of some table in the query?
We've resolved that by requiring you to write (x).y when you mean
the latter, but it's not exactly an intuitive or pleasant answer.
In the same way, if namespaces can be nested to different levels,
it gets really messy to support abbreviations of any sort --- but
the SQL spec requires us to be able to do so.
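A small example of that ambiguity (made-up names):

CREATE TABLE x (y int);
CREATE TYPE xt AS (y int);
CREATE TABLE t (x xt);

SELECT x.y FROM x;     -- table x's column y
SELECT (x).y FROM t;   -- field y of composite column x; the parentheses are required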

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general