Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-12 Thread Florian Pflug
On Jul11, 2011, at 21:49 , David Johnston wrote:
 Right now I can emulate a hierarchical schema structure via a naming scheme
 - for example  schemabase_sub1_sub2_etc.  I am simply looking for a formal
 way to do the above AND also tell the system that I want all schemas under
 schemabase to be in the search path.  Heck, I guess just allowing for
 simply pattern matching in search_path would be useful in this case
 regardless of the presence of an actual schema hierarchy.  Using LIKE
 syntax say: SET search_path TO schemabase_sub1_% or something similar.

create function set_searchpath_expand(v_pattern text) returns void as $$
declare
  v_searchpath text;
begin
  select string_agg(quote_ident(nspname), ',') into v_searchpath
from pg_catalog.pg_namespace where nspname like v_pattern;
  execute 'set search_path = ' || v_searchpath;
end
$$ language plpgsql;

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-12 Thread Alban Hertroys
On 12 Jul 2011, at 3:25, Chris Travers wrote:

 Right now I can emulate a hierarchical schema structure via a naming scheme
 - for example  schemabase_sub1_sub2_etc.  I am simply looking for a formal
 way to do the above AND also tell the system that I want all schemas under
 schemabase to be in the search path.  Heck, I guess just allowing for
 simply pattern matching in search_path would be useful in this case
 regardless of the presence of an actual schema hierarchy.  Using LIKE
 syntax say: SET search_path TO schemabase_sub1_% or something similar.
 The only missing ability becomes a way for graphical tools to represent the
 schema hierarchy using a tree-structure with multiple depths.
 
 Right.  Semantically myapp_schemaname_subschemaname is no less
 hierarchical than myapp.schemaname.subschemaname.

It is. To the database, your first example is a single identifier, while your 
second example is a sequential combination of three identifiers. The second one 
contains explicit hierarchy, the first one does not.

It's quite possible that the fact that these identifiers have a sequence is the 
biggest problem for implementing this in a relational database. Relational 
databases work with sets after all, which have no explicit sequence. With the 
introduction of recursive queries that's _possible_, but as claimed earlier 
(and I tend to agree), for performance reasons it is undesirable to apply this 
to system tables.

If we were talking about a _set_ of identifiers instead, without the 
requirement of a hierarchy (eg. myapp.schemaname.subschemaname = 
subschemaname.myapp.schemaname), implementation would probably be 
easier/perform better.

That does have some interesting implications for incompletely specified sets of 
namespaces, I'm not sure how desirable they are.
What's cool is that you can specify just a server hostname and a table-name and 
(as long as there's no ambiguity) that's sufficient.
Not so cool, if you use the above and someone clones the database on said host, 
you've suddenly introduced ambiguity.

Problems like that are likely to turn up with incomplete identifier 
specifications though, just a bit more likely to happen if you take the meaning 
of the sequence of the identifiers out. Just a bit.

 The larger issue is that of potential ambiguity wrt cross-database references.


Not necessarily, if the reference to a remote database is unambiguously 
recognisable as such, for example by using some kind of URI notation (eg. 
dsn://user@remote-database1).

I'm also wondering how to handle this for multi-master replicated environments, 
in view of load-balancing. Those remote database references probably need to 
reference different databases depending on which master they're running on?

From a security point-of-view I'd probably require a list of accessible remote 
databases per server (so that people cannot just query any database of their 
choice). That could also serve the load-balancing scenario.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:1286,4e1c1e2912091672620445!



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-11 Thread Darren Duncan

Christopher Browne wrote:

Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1,
there's a pretty good reason NOT to support that, namely that this
breaks relational handling of tables.  PostgreSQL is a *relational*
database system, hence it's preferable for structures to be
relational, as opposed to hierarchical, which is what any of the
suggested nestings are.


I won't argue with whether or not nested naming is a good idea, but I will argue 
with your other comment about breaking relational handling.


A relational database is a database in which all data is kept in relation-typed 
variables, which SQL calls tables, and you can perform all queries and updates 
with just relation-valued expressions and statements.


Organizing the tables into a multi-level namespace, either fixed-depth or 
variable-depth, rather than using a flat namespace, does not make the database 
any less relational, because the above definition and any others still hold.


The less relational argument above is a red herring or distraction.  One can 
argue against namespace nesting just fine without saying that.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-11 Thread David Johnston
Christopher Browne wrote:
 Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1, 
 there's a pretty good reason NOT to support that, namely that this 
 breaks relational handling of tables.  PostgreSQL is a *relational* 
 database system, hence it's preferable for structures to be 
 relational, as opposed to hierarchical, which is what any of the 
 suggested nestings are.

Organizing the tables into a multi-level namespace, either fixed-depth or 
variable-depth, rather than using a flat namespace, does not make the database 
any less relational, because the above definition and any others still hold.

The less relational argument above is a red herring or distraction.  One can 
argue against namespace nesting just fine without saying that.

-- Darren Duncan


I agree with Darren.

One thought that came to my mind was to use a different separator between two 
namespaces and/or between the database identifier and the rest of the path.
Examples:

ns1!ns2.table

OR

database@ns1.table

OR

database@ns1!ns2.table

I've been following only some of the discussion but it seems that much 
ambiguity would be lost by using different separators.  Schemas themselves are 
already non-standard so it isn't like we are constrained here in what is chosen.

Just some quick thoughts I've had but haven't fully considered how they would 
fit in to the existing setup.  But is there is any major reason why choosing 
different separators would not work?

Also, within search_path, some form of wild-card selector would be desirable:  
ns1!*.  I'm not opposed to having to be explicit about the search_path in order 
to avoid name collisions; though it would be nice if VIEWS had some kind of 
SET syntax, like functions do, so that the definer can specify the 
search_path that the view will resolve against.

David J.





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-11 Thread Florian Pflug
On Jul8, 2011, at 08:21 , Darren Duncan wrote:
 Also, the proper way to do temporary tables would be to put them in
 another database than the main one, where the whole other database
 has the property of being temporary.

FWIW, Microsoft SQL Server does it that way, and as a result temporary
tables are severely restricted in a number of ways.

For example, custom datatypes defined in a non-temporary database
cannot be used in temporary table definitions, because datatypes may
only be used within the database they're defined in. You can of course
re-define the data type in the temporary database, but then obviously
have to do so every time you start new session because you start out
with an empty tempdb.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-11 Thread Florian Pflug
On Jul11, 2011, at 07:08 , Darren Duncan wrote:
 Christopher Browne wrote:
 Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1,
 there's a pretty good reason NOT to support that, namely that this
 breaks relational handling of tables.  PostgreSQL is a *relational*
 database system, hence it's preferable for structures to be
 relational, as opposed to hierarchical, which is what any of the
 suggested nestings are.
 
 A relational database is a database in which all data is kept in
 relation-typed variables, which SQL calls tables, and you can perform
 all queries and updates with just relation-valued expressions and
 statements.
 
 Organizing the tables into a multi-level namespace, either fixed-depth
 or variable-depth, rather than using a flat namespace, does not make
 the database any less relational, because the above definition and
 any others still hold.

The point was not, I think, that tables aren't suddenly relations once
namespaces are nested, but that the data model of the dbms *itself*,
i.e. the data model that defines the relationship between namespaces,
types, columns, type, ... becomes harder to map to the relational model.

For example, if namespaces can be nested, you'll need to resort to
recursive SQL and/or arrays far more often if you inspect the structure
of a database.

Btw, another argument against nested namespaces is that it actually
doesn't buy you anything in SQL, even if you solve the parsing
ambiguities. In programming languages, namespaces not only prevent
name clashes, the also defines the possible scopes to resolve unqualified
names with. For example, if you do
  void f() { printf(outer); }
 
  namespace a {
void f() { printf(inner); }
  
namespace b {
  void g() {f();}
}
  }
in C++, then a::b::g() prints inner. But in PostgreSQL, the scope in
which to resolve unqualified function is entirely determined by the the
search_path setting, *not* by the scope of the object containing the
unqualified name. Nested namespaces thus simply become of matter of
syntax - i.e., whether you can write a.b.c, or need to write a.b.c.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-11 Thread Christopher Browne
On Mon, Jul 11, 2011 at 10:12 AM, Florian Pflug f...@phlo.org wrote:
 On Jul11, 2011, at 07:08 , Darren Duncan wrote:
 Christopher Browne wrote:
 Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1,
 there's a pretty good reason NOT to support that, namely that this
 breaks relational handling of tables.  PostgreSQL is a *relational*
 database system, hence it's preferable for structures to be
 relational, as opposed to hierarchical, which is what any of the
 suggested nestings are.

 A relational database is a database in which all data is kept in
 relation-typed variables, which SQL calls tables, and you can perform
 all queries and updates with just relation-valued expressions and
 statements.

 Organizing the tables into a multi-level namespace, either fixed-depth
 or variable-depth, rather than using a flat namespace, does not make
 the database any less relational, because the above definition and
 any others still hold.

 The point was not, I think, that tables aren't suddenly relations once
 namespaces are nested, but that the data model of the dbms *itself*,
 i.e. the data model that defines the relationship between namespaces,
 types, columns, type, ... becomes harder to map to the relational model.

Just so.

It's not that it suddenly becomes no longer relational.

Rather, the argument is that it was intentional for the structuring
of table naming to, itself, be relational, and changing that
definitely has some undesirable characteristics.

The need for recursive queries is the most obvious undesirable, but
it's not the only undesirable thing, by any means.

Sure, there's some cool stuff that we can get out of nested
namespaces, but I think we'd pay a pretty big price for it, and it
shouldn't be treated as obvious that:
a) It's a good thing to do so,
b) It is desirable to do so,
c) There will be agreement to do so.

To the contrary, there are pretty good reasons to reject the idea.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-11 Thread David Johnston
On Mon, Jul 11, 2011 at 10:12 AM, Florian Pflug f...@phlo.org wrote:
 On Jul11, 2011, at 07:08 , Darren Duncan wrote:
 Christopher Browne wrote:
 Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1, 
 there's a pretty good reason NOT to support that, namely that this 
 breaks relational handling of tables.  PostgreSQL is a *relational* 
 database system, hence it's preferable for structures to be 
 relational, as opposed to hierarchical, which is what any of the 
 suggested nestings are.

Rather, the argument is that it was intentional for the structuring of
table naming to, itself, be relational, and changing that definitely has
some undesirable characteristics.

The need for recursive queries is the most obvious undesirable, but it's
not the only undesirable thing, by any means.

I do not see how recursive queries (really iteration of records) even enters
the picture...

Right now I can emulate a hierarchical schema structure via a naming scheme
- for example  schemabase_sub1_sub2_etc.  I am simply looking for a formal
way to do the above AND also tell the system that I want all schemas under
schemabase to be in the search path.  Heck, I guess just allowing for
simply pattern matching in search_path would be useful in this case
regardless of the presence of an actual schema hierarchy.  Using LIKE
syntax say: SET search_path TO schemabase_sub1_% or something similar.
The only missing ability becomes a way for graphical tools to represent the
schema hierarchy using a tree-structure with multiple depths.

I can see how adding . and .. and relative paths would confuse the issue
those are not necessary features of a multi-level schema depth.

The above, combined with a different separator for intra-level
namespace/schema delineation, would allow for an unambiguous way to define
and use a hierarchical schema with seemingly minimal invasion into the
current way of doing things. You could almost implement it just by requiring
a specific character to act as the separator and then construct the actual
schema using single-level literals and supporting functions that can convert
them into an hierarchy.  In other words, the schema table would still only
contain one field with the full parent!child as opposed to (schema,
parent) with (VALUES('parent',null),('child','parent')).

In other words, if we use ! as the separator, any schema named
parent!child  could be stored and referenced as such but then if you run a
getChildren(parent) function it would return child along with any other
schemas of the form parent!%.  In this case the % sign could maybe only
match everything except ! and the * symbol could be used to match ! as
well.

I could give more examples but I hope the basic idea is obvious.  The main
thing is that the namespace hierarchy usage is standardized in such a way
that pgAdmin and other GUI tools can reliably use for display purposes and
that search_path can be constructed in a more compact format so that every
schema and sub-schema is still absolutely referenced (you can even have the
SET command resolve search_path at execution time and then remain static
just like CREATE VIEW SELECT * FROM table.

David J.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-11 Thread Darren Duncan
I will put my support for David Johnston's proposal, in principle, though minor 
details of syntax could be changed if using ! conflicts with something. -- 
Darren Duncan


David Johnston wrote:

On Mon, Jul 11, 2011 at 10:12 AM, Florian Pflug f...@phlo.org wrote:

On Jul11, 2011, at 07:08 , Darren Duncan wrote:

Christopher Browne wrote:
Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1, 
there's a pretty good reason NOT to support that, namely that this 
breaks relational handling of tables.  PostgreSQL is a *relational* 
database system, hence it's preferable for structures to be 
relational, as opposed to hierarchical, which is what any of the 
suggested nestings are.



Rather, the argument is that it was intentional for the structuring of

table naming to, itself, be relational, and changing that definitely has
some undesirable characteristics.


The need for recursive queries is the most obvious undesirable, but it's

not the only undesirable thing, by any means.

I do not see how recursive queries (really iteration of records) even enters
the picture...

Right now I can emulate a hierarchical schema structure via a naming scheme
- for example  schemabase_sub1_sub2_etc.  I am simply looking for a formal
way to do the above AND also tell the system that I want all schemas under
schemabase to be in the search path.  Heck, I guess just allowing for
simply pattern matching in search_path would be useful in this case
regardless of the presence of an actual schema hierarchy.  Using LIKE
syntax say: SET search_path TO schemabase_sub1_% or something similar.
The only missing ability becomes a way for graphical tools to represent the
schema hierarchy using a tree-structure with multiple depths.

I can see how adding . and .. and relative paths would confuse the issue
those are not necessary features of a multi-level schema depth.

The above, combined with a different separator for intra-level
namespace/schema delineation, would allow for an unambiguous way to define
and use a hierarchical schema with seemingly minimal invasion into the
current way of doing things. You could almost implement it just by requiring
a specific character to act as the separator and then construct the actual
schema using single-level literals and supporting functions that can convert
them into an hierarchy.  In other words, the schema table would still only
contain one field with the full parent!child as opposed to (schema,
parent) with (VALUES('parent',null),('child','parent')).

In other words, if we use ! as the separator, any schema named
parent!child  could be stored and referenced as such but then if you run a
getChildren(parent) function it would return child along with any other
schemas of the form parent!%.  In this case the % sign could maybe only
match everything except ! and the * symbol could be used to match ! as
well.

I could give more examples but I hope the basic idea is obvious.  The main
thing is that the namespace hierarchy usage is standardized in such a way
that pgAdmin and other GUI tools can reliably use for display purposes and
that search_path can be constructed in a more compact format so that every
schema and sub-schema is still absolutely referenced (you can even have the
SET command resolve search_path at execution time and then remain static
just like CREATE VIEW SELECT * FROM table.

David J.







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-11 Thread Chris Travers
On Mon, Jul 11, 2011 at 12:49 PM, David Johnston pol...@yahoo.com wrote:

 I do not see how recursive queries (really iteration of records) even enters
 the picture...

I agree, FWIW.  If the feature was that desirable, we could look at
questions of implementation to make recursion either unnecessary or at
least well managed.

 Right now I can emulate a hierarchical schema structure via a naming scheme
 - for example  schemabase_sub1_sub2_etc.  I am simply looking for a formal
 way to do the above AND also tell the system that I want all schemas under
 schemabase to be in the search path.  Heck, I guess just allowing for
 simply pattern matching in search_path would be useful in this case
 regardless of the presence of an actual schema hierarchy.  Using LIKE
 syntax say: SET search_path TO schemabase_sub1_% or something similar.
 The only missing ability becomes a way for graphical tools to represent the
 schema hierarchy using a tree-structure with multiple depths.

Right.  Semantically myapp_schemaname_subschemaname is no less
hierarchical than myapp.schemaname.subschemaname.  The larger issue is
that of potential ambiguity wrt cross-database references (I don't
have a lot of experience reading the SQL standards, but seeing how
different db's implement cross-db references suggests that the
standards contemplate semantic meaning to depth of the namespace).


 I can see how adding . and .. and relative paths would confuse the issue
 those are not necessary features of a multi-level schema depth.

 The above, combined with a different separator for intra-level
 namespace/schema delineation, would allow for an unambiguous way to define
 and use a hierarchical schema with seemingly minimal invasion into the
 current way of doing things. You could almost implement it just by requiring
 a specific character to act as the separator and then construct the actual
 schema using single-level literals and supporting functions that can convert
 them into an hierarchy.  In other words, the schema table would still only
 contain one field with the full parent!child as opposed to (schema,
 parent) with (VALUES('parent',null),('child','parent')).

 In other words, if we use ! as the separator, any schema named
 parent!child  could be stored and referenced as such but then if you run a
 getChildren(parent) function it would return child along with any other
 schemas of the form parent!%.  In this case the % sign could maybe only
 match everything except ! and the * symbol could be used to match ! as
 well.

Agreed that this would be helpful.  I would personally have a lot of
use for this sort of feature, particularly with managing large numbers
of stored procedures.  Right now I am using a double underscore which
is error-prone.

Best Wishes,
Chris Travers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-10 Thread mike beeper


I like Darren's proposal.  It is elegant.

 Date: Fri, 8 Jul 2011 18:38:59 +1200
 From: gavinflo...@archidevsys.co.nz
 To: dar...@darrenduncan.net
 CC: pg...@j-davis.com; guilla...@lelarge.info; mbee...@hotmail.com; 
 pgsql-gene...@postgresql.org; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] [GENERAL] Creating temp tables inside read only 
 transactions
 
 On 08/07/11 18:21, Darren Duncan wrote:
  Jeff Davis wrote:
  On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
  When you create a temporary table, PostgreSQL needs to add rows in
  pg_class, pg_attribute, and probably other system catalogs. So 
  there are
  writes, which aren't possible in a read-only transaction. Hence the
  error. And no, there is no workaround.
  That sounds like a deficiency to overcome.
 
  It should be possible for those system catalogs to be virtual, 
  defined like union views over similar immutable tables for the 
  read-only database plus mutable in-memory ones for the temporary 
  tables.
 
  Ideally, yes, from a logical standpoint there are catalog entries that
  are only interesting to one backend.
 
  But that doesn't mean it's easy to do. Remember that catalog lookups
  (even though most go through a cache) are a path that is important to
  performance. Also, more complex catalog interpretations may introduce
  some extra bootstrapping challenges.
 
  Are there any plans in the works to do this?
 
  I don't think so. It sounds like some fairly major work for a
  comparatively minor benefit.
 
  Suggestions welcome, of course, to either make the work look more minor
  or the benefits look more major ;)
 
  What I said before was a simplification; below I present my real 
  proposal.
 
  I think an even better way to support this is would be based on 
  Postgres having support for directly using multiple databases within 
  the same SQL session at once, as if namespaces were another level 
  deep, the first level being the databases, the second level the 
  schemas, and the third level the schema objects.
 
  Kind of like what the SQL standard defines its catalog/schema/object 
  namespaces.
 
  This instead of needing to use federating or that contrib module to 
  use multiple Pg databases of the same cluster at once.
 
  Under this scenario, we make the property of a database being 
  read-only or read-write for the current SQL session associated with a 
  database rather than the whole SQL session.  A given transaction can 
  read from any database but can only make changes to the ones not 
  read-only.
 
  Also, the proper way to do temporary tables would be to put them in 
  another database than the main one, where the whole other database has 
  the property of being temporary.
 
  Under this scenario, there would be separate system catalogs for each 
  database, and so the ones for read-only databases are read-only, and 
  the ones for other databases aren't.
 
  Then the system catalog itself fundamentally isn't more complicated, 
  per database, and anything extra to handle cross-database queries or 
  whatever, if anything, is a separate layer.  Code that only deals with 
  a single database at once would be an optimized situation and perform 
  no worse than it does now.
 
  Furthermore, federating databases is done with the same interface, by 
  adding remote/foreign databases as extra databases at the top level 
  namespace.
 
  Fundamentally, a SQL session would be associated with a Pg server, not 
  a database managed by such.  When one starts a SQL session, there are 
  initially no databases visible to them, and the top-level namespace is 
  empty.
 
  They then mount a database, similarly to how one mounts an OS 
  filesystem, by providing appropriate connection info, either just the 
  database name or also user/pass or also remote host etc as is 
  applicable, these details being the difference between using a 
  local/same-Pg-cluster db or a remote/federated one, and the details 
  also say whether it is temporary or initially read-only etc.
 
  See also how SQLite works; this mount being analogous to their 
  attach.
 
  Such a paradigm is also how my Muldis D language interfaces databases; 
  this is the most flexible, portable, extensible, optimizable, and 
  elegant approach I can think of.
 
  -- Darren Duncan
 
 I would suggest that the default action for psql would be as now, 
 associate the session with a database in the name of the current O/S user.
 
 However, use a new psql flag, such as '-unattached' or '-N', to indicate 
 that no database is to be attached when psql starts up.
 
 While I don't have a current need for what you propose, it does look 
 interesting and potentially useful to me.
 
  

Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-10 Thread Gavin Flower

On 08/07/11 18:21, Darren Duncan wrote:

Jeff Davis wrote:

On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:

When you create a temporary table, PostgreSQL needs to add rows in
pg_class, pg_attribute, and probably other system catalogs. So 
there are

writes, which aren't possible in a read-only transaction. Hence the
error. And no, there is no workaround.

That sounds like a deficiency to overcome.

It should be possible for those system catalogs to be virtual, 
defined like union views over similar immutable tables for the 
read-only database plus mutable in-memory ones for the temporary 
tables.


Ideally, yes, from a logical standpoint there are catalog entries that
are only interesting to one backend.

But that doesn't mean it's easy to do. Remember that catalog lookups
(even though most go through a cache) are a path that is important to
performance. Also, more complex catalog interpretations may introduce
some extra bootstrapping challenges.


Are there any plans in the works to do this?


I don't think so. It sounds like some fairly major work for a
comparatively minor benefit.

Suggestions welcome, of course, to either make the work look more minor
or the benefits look more major ;)


What I said before was a simplification; below I present my real 
proposal.


I think an even better way to support this is would be based on 
Postgres having support for directly using multiple databases within 
the same SQL session at once, as if namespaces were another level 
deep, the first level being the databases, the second level the 
schemas, and the third level the schema objects.


Kind of like what the SQL standard defines its catalog/schema/object 
namespaces.


This instead of needing to use federating or that contrib module to 
use multiple Pg databases of the same cluster at once.


Under this scenario, we make the property of a database being 
read-only or read-write for the current SQL session associated with a 
database rather than the whole SQL session.  A given transaction can 
read from any database but can only make changes to the ones not 
read-only.


Also, the proper way to do temporary tables would be to put them in 
another database than the main one, where the whole other database has 
the property of being temporary.


Under this scenario, there would be separate system catalogs for each 
database, and so the ones for read-only databases are read-only, and 
the ones for other databases aren't.


Then the system catalog itself fundamentally isn't more complicated, 
per database, and anything extra to handle cross-database queries or 
whatever, if anything, is a separate layer.  Code that only deals with 
a single database at once would be an optimized situation and perform 
no worse than it does now.


Furthermore, federating databases is done with the same interface, by 
adding remote/foreign databases as extra databases at the top level 
namespace.


Fundamentally, a SQL session would be associated with a Pg server, not 
a database managed by such.  When one starts a SQL session, there are 
initially no databases visible to them, and the top-level namespace is 
empty.


They then mount a database, similarly to how one mounts an OS 
filesystem, by providing appropriate connection info, either just the 
database name or also user/pass or also remote host etc as is 
applicable, these details being the difference between using a 
local/same-Pg-cluster db or a remote/federated one, and the details 
also say whether it is temporary or initially read-only etc.


See also how SQLite works; this mount being analogous to their 
attach.


Such a paradigm is also how my Muldis D language interfaces databases; 
this is the most flexible, portable, extensible, optimizable, and 
elegant approach I can think of.


-- Darren Duncan

I would suggest that the default action for psql would be as now, 
associate the session with a database in the name of the current O/S user.


However, use a new psql flag, such as '-unattached' or '-N', to indicate 
that no database is to be attached when psql starts up.


While I don't have a current need for what you propose, it does look 
interesting and potentially useful to me.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-10 Thread Michael Nolan
On Fri, Jul 8, 2011 at 10:27 PM, Robert Haas robertmh...@gmail.com wrote:


 But if that's what you want, just don't put your data in different
 databases in the first place.  That's what schemas are for.


Sadly, DBAs don't always have the ability to put all their data in one
database, even if that is what schemas are for.

The ability to do cross-database (most likely cross-server as well) queries
would address a lot of real-world problems.
-
Mike Nolan
no...@tssi.com


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-10 Thread Christopher Browne
On Mon, Jul 11, 2011 at 12:01 AM, Michael Nolan htf...@gmail.com wrote:


 On Fri, Jul 8, 2011 at 10:27 PM, Robert Haas robertmh...@gmail.com wrote:

 But if that's what you want, just don't put your data in different
 databases in the first place.  That's what schemas are for.

 Sadly, DBAs don't always have the ability to put all their data in one
 database, even if that is what schemas are for.

 The ability to do cross-database (most likely cross-server as well) queries
 would address a lot of real-world problems.

Sure, there's quite a lot of use to that sort of thing.

Once 9.1 is out, with the FDW (Foreign Data Wrapper) concept, it'll
make a lot of sense to add FDW methods for various sorts of access
methods, including accessing other PG instances, cross-server.

We already have a fine mechanism for this; no need to create some
different mechanism out of whole cloth.

Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1,
there's a pretty good reason NOT to support that, namely that this
breaks relational handling of tables.  PostgreSQL is a *relational*
database system, hence it's preferable for structures to be
relational, as opposed to hierarchical, which is what any of the
suggested nestings are.

Having to write recursive queries just to look for fully qualified
table names is much more of a bug than it is a feature.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-09 Thread Jeff Davis
On Fri, 2011-07-08 at 12:34 -0700, Darren Duncan wrote:
 Yes, but that would just be in-memory or in temporary places external to 
 every 
 database.  On disk internal to a database there would just be the oid.  In 
 fact, 
 another aspect of the database model I defined is that each database is 
 entirely self-contained; while you can do cross-database queries, you don't 
 have 
 cross-database constraints, in the general case.

Yes, you can have a local oid and a fully-qualified oid. It sounds
like it might take some effort (which is an understatement) to go
through the system and figure out which ones should be local and which
ones should be fully-qualified.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-09 Thread Darren Duncan

Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:

If for some reason we needed to have tables that happened to be called
x.y.z and a.b.c accessible from a single SQL session, we could allow
that much more simply by allowing schemas to be nested.  Then we could
allow arbitrary numbers of levels, not just three.


FWIW, I actually tried to do that back when we first introduced schema
support (the fact that the code calls them namespaces and not schemas
is a leftover from that idea).  It turns out to be a whole lot harder
than it sounds, because of the ambiguity you get about which name goes
at what level.  A simple example of this is: if you write x.y in a
query, is that meant to be table x's column y, or is it meant to be
field y within a composite column x of some table in the query?
We've resolved that by requiring you to write (x).y when you mean
the latter, but it's not exactly an intuitive or pleasant answer.
In the same way, if namespaces can be nested to different levels,
it gets really messy to support abbreviations of any sort --- but
the SQL spec requires us to be able to do so.


What if you used the context of the calling code and resolve in favor of 
whatever match is closest to it?  The problem is related to general-purpose 
programming languages.


Basically start looking in the lexical context for an x and if you find one 
use that; otherwise, assuming we're talking about referencing code that lives in 
the database such as a function, look at the innermost schema containing the 
referencing code and see if it has a direct child named x; otherwise go up one 
level to a parent schema, and so on until you get to the top, and finding none 
by then say it doesn't exist.


If there are several x in this search sequence, only use the first one 
regardless of whether it has a y, so to prevent bugs from too much complexity. 
 Same for just looking for x by itself in fact, not just an x.y.


For the case of calling code that doesn't live in the database such as a 
client-side query, I believe there are session variables like current schema 
or such, and you can use this as the starting point for the search for x, 
looking first at what that schema directly contains, and then its parent, and so on.


Something like that.

Or ignore what I said about starting in a lexical context and do what you 
already do there, but keep what I said about relative order of schemas to 
search, only searching direct children of ancestors of the current code's 
context schema starting with the current context.


You could also come up with some relative name syntax such as filesystems 
support with their ../ and such, but that's further from standard SQL.


-- Darren Duncan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-09 Thread Jeff Davis
On Fri, 2011-07-08 at 21:04 -0700, Darren Duncan wrote:
  I think you should make more of an effort to understand how the system
  works now, and why, before proposing radical redesigns.
 
 Well yes, of course.  But that will take time and I think I already 
 understand 
 enough about it to make some useful contributions in the meantime.  How much 
 or 
 what I already know may not always come across well.  If this bothers people 
 then I can make more of an effort to reduce my input until I have more solid 
 things to back them up.

I don't think anyone expects you to understand all the internal APIs in
postgres before you make a proposal. But we do expect you to look
critically at your own proposals with the status quo (i.e. existing
code, users, and standards) in mind. And that probably means poking at
the code a little to see if you find stumbling blocks, and asking
questions to try to trace out the shape of the project.

I'm hoping that we can learn a lot from your work on Muldis D. In
particular, the type system might be the most fertile ground -- you've
clearly done some interesting things there, and I think we've felt some
pressure to improve the type system from a number of different
projects*.

Regards,
Jeff Davis

* That being said, PostgreSQL's type system is actually very good.
Consider the sophisticated type infrastructure (or at least plumbing
around the type system) required to make KNN-GiST work, for instance.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-09 Thread Jeff Davis
On Fri, 2011-07-08 at 23:39 -0700, Darren Duncan wrote:
 What if you used the context of the calling code and resolve in favor of 
 whatever match is closest to it?  The problem is related to general-purpose 
 programming languages.
 
 Basically start looking in the lexical context for an x and if you find one 
 use that; otherwise, assuming we're talking about referencing code that lives 
 in 
 the database such as a function, look at the innermost schema containing the 
 referencing code and see if it has a direct child named x; otherwise go up 
 one 
 level to a parent schema, and so on until you get to the top, and finding 
 none 
 by then say it doesn't exist.

This is an example of where data languages and normal programming
languages have a crucial difference.

With a data language, you have this problem:
 1. An application uses a query referencing 'y.z.foo' that resolves to
internal object with fully-qualified name 'x.y.z'.
 2. An administrator creates object 'y.z.foo'.

Now, the application breaks all of a sudden.

In a normal prgramming language, if the schema of the two foos are
different, the compiler could probably catch the error. SQL really has
no hope of catching it though.

PostgreSQL has this problem now in a couple ways, but it's much easier
to grasp what you might be conflicting with. If you have multiple nested
levels to traverse and different queries using different levels of
qualification, it gets a little more messy and I think a mistake is more
likely.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-09 Thread Darren Duncan

Jeff Davis wrote:

On Fri, 2011-07-08 at 23:39 -0700, Darren Duncan wrote:
What if you used the context of the calling code and resolve in favor of 
whatever match is closest to it?  The problem is related to general-purpose 
programming languages.


Basically start looking in the lexical context for an x and if you find one 
use that; otherwise, assuming we're talking about referencing code that lives in 
the database such as a function, look at the innermost schema containing the 
referencing code and see if it has a direct child named x; otherwise go up one 
level to a parent schema, and so on until you get to the top, and finding none 
by then say it doesn't exist.


This is an example of where data languages and normal programming
languages have a crucial difference.

With a data language, you have this problem:
 1. An application uses a query referencing 'y.z.foo' that resolves to
internal object with fully-qualified name 'x.y.z'.
 2. An administrator creates object 'y.z.foo'.

Now, the application breaks all of a sudden.

In a normal prgramming language, if the schema of the two foos are
different, the compiler could probably catch the error. SQL really has
no hope of catching it though.

PostgreSQL has this problem now in a couple ways, but it's much easier
to grasp what you might be conflicting with. If you have multiple nested
levels to traverse and different queries using different levels of
qualification, it gets a little more messy and I think a mistake is more
likely.


Well, my search path suggestion was based on Tom Lane's comment that the SQL 
spec requires us to be able to [support abbreviations] and I expected it would 
be syntactically and semantically backwards compatible with how things work now.


FYI, with Muldis D, being more green fields, there are no search paths in the 
general case, and every entity reference is unambiguous because it has to be 
fully-qualified.


However, I also support relative references, and in fact require their use for 
references within the same database, which carries a number of benefits, at the 
cost of being a few characters more verbose than when using a search path.  So 
introducing new things with the same names in different namespaces won't break 
anything there, even if they are closer.  Its essentially like navigating a 
Unix filesystem but with . rather than /.


So for example, if you had 2 sibling schemas s1 and s2, each with 2 
functions f1,f2 and a table t, then s1.f1 would reference s1.f2 and s1.t 
as sch.lib.f2 and sch.data.t respectively, while s1.f1 would refer to the 
entities in s2 as sch.par.s2.lib.f1 and sch.par.s2.data.t and such (a function 
can also refer to itself anonymously as rtn if it's recursive).  The sch is 
like . in Unix and the par is like .. in Unix.  The data is for data 
tables or views (and cat is for catalog tables/views) while lib is for 
user-defined types, routines, constraints, etc (and sys is for built-in types 
and routines, but sys may be omitted and search paths exist just for 
built-ins).  Synonyms are also supported.


I don't expect you would adopt relative (fully-qualified) references, because 
the syntax isn't in standard SQL (I think), but I did.  Unless you like them and 
can come up with a syntax that will fit into how SQL does things.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Darren Duncan

Jeff Davis wrote:

On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:

When you create a temporary table, PostgreSQL needs to add rows in
pg_class, pg_attribute, and probably other system catalogs. So there are
writes, which aren't possible in a read-only transaction. Hence the
error. And no, there is no workaround.

That sounds like a deficiency to overcome.

It should be possible for those system catalogs to be virtual, defined like 
union views over similar immutable tables for the read-only database plus 
mutable in-memory ones for the temporary tables.


Ideally, yes, from a logical standpoint there are catalog entries that
are only interesting to one backend.

But that doesn't mean it's easy to do. Remember that catalog lookups
(even though most go through a cache) are a path that is important to
performance. Also, more complex catalog interpretations may introduce
some extra bootstrapping challenges.


Are there any plans in the works to do this?


I don't think so. It sounds like some fairly major work for a
comparatively minor benefit.

Suggestions welcome, of course, to either make the work look more minor
or the benefits look more major ;)


What I said before was a simplification; below I present my real proposal.

I think an even better way to support this is would be based on Postgres having 
support for directly using multiple databases within the same SQL session at 
once, as if namespaces were another level deep, the first level being the 
databases, the second level the schemas, and the third level the schema objects.


Kind of like what the SQL standard defines its catalog/schema/object namespaces.

This instead of needing to use federating or that contrib module to use multiple 
Pg databases of the same cluster at once.


Under this scenario, we make the property of a database being read-only or 
read-write for the current SQL session associated with a database rather than 
the whole SQL session.  A given transaction can read from any database but can 
only make changes to the ones not read-only.


Also, the proper way to do temporary tables would be to put them in another 
database than the main one, where the whole other database has the property of 
being temporary.


Under this scenario, there would be separate system catalogs for each database, 
and so the ones for read-only databases are read-only, and the ones for other 
databases aren't.


Then the system catalog itself fundamentally isn't more complicated, per 
database, and anything extra to handle cross-database queries or whatever, if 
anything, is a separate layer.  Code that only deals with a single database at 
once would be an optimized situation and perform no worse than it does now.


Furthermore, federating databases is done with the same interface, by adding 
remote/foreign databases as extra databases at the top level namespace.


Fundamentally, a SQL session would be associated with a Pg server, not a 
database managed by such.  When one starts a SQL session, there are initially no 
databases visible to them, and the top-level namespace is empty.


They then mount a database, similarly to how one mounts an OS filesystem, by 
providing appropriate connection info, either just the database name or also 
user/pass or also remote host etc as is applicable, these details being the 
difference between using a local/same-Pg-cluster db or a remote/federated one, 
and the details also say whether it is temporary or initially read-only etc.


See also how SQLite works; this mount being analogous to their attach.

Such a paradigm is also how my Muldis D language interfaces databases; this is 
the most flexible, portable, extensible, optimizable, and elegant approach I can 
think of.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Jeff Davis
On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:
 I think an even better way to support this is would be based on Postgres 
 having 
 support for directly using multiple databases within the same SQL session at 
 once, as if namespaces were another level deep, the first level being the 
 databases, the second level the schemas, and the third level the schema 
 objects.
 
 Kind of like what the SQL standard defines its catalog/schema/object 
 namespaces.
 
 This instead of needing to use federating or that contrib module to use 
 multiple 
 Pg databases of the same cluster at once.
 
 Under this scenario, we make the property of a database being read-only or 
 read-write for the current SQL session associated with a database rather than 
 the whole SQL session.  A given transaction can read from any database but 
 can 
 only make changes to the ones not read-only.
 
 Also, the proper way to do temporary tables would be to put them in another 
 database than the main one, where the whole other database has the property 
 of 
 being temporary.
 
 Under this scenario, there would be separate system catalogs for each 
 database, 
 and so the ones for read-only databases are read-only, and the ones for other 
 databases aren't.
 
 Then the system catalog itself fundamentally isn't more complicated, per 
 database, and anything extra to handle cross-database queries or whatever, if 
 anything, is a separate layer.  Code that only deals with a single database 
 at 
 once would be an optimized situation and perform no worse than it does now.

One challenge that jumps to mind here is that an Oid would need to
become a pair (catalog, oid). Even if the end result isn't much more
complex, getting there is not trivial.

 See also how SQLite works; this mount being analogous to their attach.

I'm not sure SQLite is the best example. It has a radically different
architecture.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Darren Duncan

Jeff Davis wrote:

On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:
I think an even better way to support this is would be based on Postgres having 
support for directly using multiple databases within the same SQL session at 
once, as if namespaces were another level deep, the first level being the 
databases, the second level the schemas, and the third level the schema objects.

snip
Then the system catalog itself fundamentally isn't more complicated, per 
database, and anything extra to handle cross-database queries or whatever, if 
anything, is a separate layer.  Code that only deals with a single database at 
once would be an optimized situation and perform no worse than it does now.


One challenge that jumps to mind here is that an Oid would need to
become a pair (catalog, oid). Even if the end result isn't much more
complex, getting there is not trivial.


Yes, but that would just be in-memory or in temporary places external to every 
database.  On disk internal to a database there would just be the oid.  In fact, 
another aspect of the database model I defined is that each database is 
entirely self-contained; while you can do cross-database queries, you don't have 
cross-database constraints, in the general case.



See also how SQLite works; this mount being analogous to their attach.


I'm not sure SQLite is the best example. It has a radically different
architecture.


Still, its an example I know of where you can access several clearly separable 
databases at once through a common namespace.  While one might argue this is a 
substitute for multiple schema support, I don't because with multiple schemas 
you can have integrity constraints that cross schemas.  The namespaces issue is 
largely orthogonal to self-containment or integrity in my model.


But look at Oracle too, at least how I understand it.

Oracle supports CONNECT TO ... AUTHORIZE .../etc SQL, meaning you can define 
what databases you are accessing within the SQL session, rather than having to 
do it externally.  I assume that Oracle's features correspond somewhat to my 
proposal, and so enable cross-database queries in the illusion that several 
databases are one.


Suffice it to say, I have thought through my proposed model for years, with one 
of its (and Muldis D's) express purposes in providing a common normalized 
paradigm that all the existing SQL DBMSs can map to with consistent behavior 
whether Oracle or SQLite, and I haven't stated all of it here (a lot more is in 
my published language spec).  Key mapping points are the boundaries of a 
database's self-definability.  And namespace nesting is actually 
arbitrary-depth, so accounting for everything from no native schema support to 
schema plus package namespace support.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Robert Haas
On Fri, Jul 8, 2011 at 2:21 AM, Darren Duncan dar...@darrenduncan.net wrote:
 I think an even better way to support this is would be based on Postgres
 having support for directly using multiple databases within the same SQL
 session at once, as if namespaces were another level deep, the first level
 being the databases, the second level the schemas, and the third level the
 schema objects.

 Kind of like what the SQL standard defines its catalog/schema/object
 namespaces.

 This instead of needing to use federating or that contrib module to use
 multiple Pg databases of the same cluster at once.

But if that's what you want, just don't put your data in different
databases in the first place.  That's what schemas are for.

If for some reason we needed to have tables that happened to be called
x.y.z and a.b.c accessible from a single SQL session, we could allow
that much more simply by allowing schemas to be nested.  Then we could
allow arbitrary numbers of levels, not just three.  The whole point of
having databases and schemas as separate objects is that they do
different things: schemas are just containers for names, allowing
common access to data, and databases are completely separate entities,
allowing privilege separation for (say) a multi-tenant hosting
environment.  We're not going to throw out the latter concept just so
people can use two dots in their table names instead of one.

 Under this scenario, we make the property of a database being read-only or
 read-write for the current SQL session associated with a database rather
 than the whole SQL session.  A given transaction can read from any database
 but can only make changes to the ones not read-only.

 Also, the proper way to do temporary tables would be to put them in another
 database than the main one, where the whole other database has the property
 of being temporary.

 Under this scenario, there would be separate system catalogs for each
 database, and so the ones for read-only databases are read-only, and the
 ones for other databases aren't.

 Then the system catalog itself fundamentally isn't more complicated, per
 database, and anything extra to handle cross-database queries or whatever,
 if anything, is a separate layer.  Code that only deals with a single
 database at once would be an optimized situation and perform no worse than
 it does now.

I think you should make more of an effort to understand how the system
works now, and why, before proposing radical redesigns.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Darren Duncan

Robert Haas wrote:

But if that's what you want, just don't put your data in different
databases in the first place.  That's what schemas are for.

If for some reason we needed to have tables that happened to be called
x.y.z and a.b.c accessible from a single SQL session, we could allow
that much more simply by allowing schemas to be nested.  Then we could
allow arbitrary numbers of levels, not just three.  The whole point of
having databases and schemas as separate objects is that they do
different things: schemas are just containers for names, allowing
common access to data, and databases are completely separate entities,
allowing privilege separation for (say) a multi-tenant hosting
environment.  We're not going to throw out the latter concept just so
people can use two dots in their table names instead of one.


I agree with what you're saying in general and that schema namespaces should be 
nestable to arbitrary levels.  One dot or two isn't an issue I have.


Dividing based on databases or on schemas is a big and important distinction.

I see that the semantic purpose of using multiple databases is to allow things 
to be completely independent and self-defined, where one can understand the 
meaning of any one database in isolation.  So one can take each of the 2 
databases and walk off with them in opposite directions, and each can still be 
used and understood.


Whereas, schemas are namespaces for organizing entities within a single database 
where any of those entities may be interdependent, such as defining a data type 
in one schema and using it as the declared type with a routine or table or 
constraint in another.


But just because you use multiple databases in order for them to be independent, 
sometimes one still wants to use them together, and an abstraction loosely like 
federating is useful here.



I think you should make more of an effort to understand how the system
works now, and why, before proposing radical redesigns.


Well yes, of course.  But that will take time and I think I already understand 
enough about it to make some useful contributions in the meantime.  How much or 
what I already know may not always come across well.  If this bothers people 
then I can make more of an effort to reduce my input until I have more solid 
things to back them up.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If for some reason we needed to have tables that happened to be called
 x.y.z and a.b.c accessible from a single SQL session, we could allow
 that much more simply by allowing schemas to be nested.  Then we could
 allow arbitrary numbers of levels, not just three.

FWIW, I actually tried to do that back when we first introduced schema
support (the fact that the code calls them namespaces and not schemas
is a leftover from that idea).  It turns out to be a whole lot harder
than it sounds, because of the ambiguity you get about which name goes
at what level.  A simple example of this is: if you write x.y in a
query, is that meant to be table x's column y, or is it meant to be
field y within a composite column x of some table in the query?
We've resolved that by requiring you to write (x).y when you mean
the latter, but it's not exactly an intuitive or pleasant answer.
In the same way, if namespaces can be nested to different levels,
it gets really messy to support abbreviations of any sort --- but
the SQL spec requires us to be able to do so.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-07 Thread Darren Duncan

Guillaume Lelarge wrote [on pgsql-general]:

On Thu, 2011-07-07 at 16:01 +, mike beeper wrote [on pgsql-general]:

I have a function that creates a temp table, populate it with results
during intermediate processing, and reads from it at the end.  When
the transaction is marked as read only, it does not allow creation of
temp table, even though there are no permanent writes to the db.  Are
there any workarounds? The following block errors out.

SET TRANSACTION ISOLATION LEVEL READ COMMITTED READ ONLY;
create temp table test(test int);


When you create a temporary table, PostgreSQL needs to add rows in
pg_class, pg_attribute, and probably other system catalogs. So there are
writes, which aren't possible in a read-only transaction. Hence the
error. And no, there is no workaround.


That sounds like a deficiency to overcome.

It should be possible for those system catalogs to be virtual, defined like 
union views over similar immutable tables for the read-only database plus 
mutable in-memory ones for the temporary tables.


Are there any plans in the works to do this?

On the other hand, if one can have lexical-scope tables (table-typed routine 
variables), and I know Pg 8.4+ has named subqueries which handle a lot of cases 
where temp tables would otherwise be used, I would certainly expect those to 
work when you're dealing with a readonly database.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions

2011-07-07 Thread Jeff Davis
On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
  When you create a temporary table, PostgreSQL needs to add rows in
  pg_class, pg_attribute, and probably other system catalogs. So there are
  writes, which aren't possible in a read-only transaction. Hence the
  error. And no, there is no workaround.
 
 That sounds like a deficiency to overcome.
 
 It should be possible for those system catalogs to be virtual, defined like 
 union views over similar immutable tables for the read-only database plus 
 mutable in-memory ones for the temporary tables.

Ideally, yes, from a logical standpoint there are catalog entries that
are only interesting to one backend.

But that doesn't mean it's easy to do. Remember that catalog lookups
(even though most go through a cache) are a path that is important to
performance. Also, more complex catalog interpretations may introduce
some extra bootstrapping challenges.

 Are there any plans in the works to do this?

I don't think so. It sounds like some fairly major work for a
comparatively minor benefit.

Suggestions welcome, of course, to either make the work look more minor
or the benefits look more major ;)

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers