Re: [PATCHES] Partitioning docs

2005-10-31 Thread Simon Riggs
On Mon, 2005-10-31 at 02:46 +, Simon Riggs wrote:
> I've been working on some docs for Constraining Exclusion & Partitioning
> for some time now. Deadlines seem to be looming, or may even have
> passed, so it seems sensible to submit what I have now. 

> Many thanks to Josh Berkus for providing the numbered section on
> implementation process, which was the starting point I'd been looking
> for to describe everything else.

I believe this is now complete and ready for application.

- passes sgml make against cvstip
- spellchecked
- all code executed correctly against RC1

Comments please? Apart from the obvious, so why did it take you so long.
Apologies to the translators.

Best Regards, Simon Riggs

Index: ddl.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ddl.sgml,v
retrieving revision 1.45
diff -c -c -r1.45 ddl.sgml
*** ddl.sgml	23 Oct 2005 19:29:49 -	1.45
--- ddl.sgml	31 Oct 2005 22:36:26 -
***
*** 398,403 
--- 398,410 
  ensure that a column does not contain null values, the not-null
  constraint described in the next section can be used.
 
+ 
+ 
+  Check constraints can also be used to enhance performance with
+  very large tables, when used in conjunction with the 
+   parameter. 
+  This is discussed in more detail in 
+

  

***
*** 1040,1052 
   
Inheritance
  
!   This section needs to be rethought.  Some of the
!   information should go into the following chapters.
  

!Let's create two tables. The capitals  table  contains
!state  capitals  which  are also cities. Naturally, the
!capitals table should inherit from cities.
  
  
  CREATE TABLE cities (
--- 1047,1081 
   
Inheritance
  
!
! not-null constraint
!
! 
!
! constraint
! NOT NULL
!
! 
!   
!PostgreSQL was the first DBMS to introduce
!inheritance, one of its many object-relational features.
!PostgreSQL implements table
!inheritance which can be a useful tool for database designers. 
!The SQL:2003 standard optionally defines type inheritance which differs
!in many respects from the features described here.
!   
  

!Let's start with an example: 
!We're trying to build a data model for cities, but we have a problem.
!Each state has many cities, but only one capital. We want to be able
!to quickly retrieve the capital city for any particular state. We
!can solve this problem by creating two tables. The capitals table contains
!state  capitals, then we have another table for cities that aren't capitals.
!What happens when we want to ask for data about a city, regardless of
!whether it is a capital or not? We can use the inheritance feature to
!help resolve this problem for us. We define the capitals table so that
!it inherits from cities.
  
  
  CREATE TABLE cities (
***
*** 1062,1077 
  
 In this case, a row of capitals inherits all
 attributes (name, population, and altitude) from its parent, cities.  State
!capitals have an extra attribute, state, that shows their state.  In
!PostgreSQL, a table can inherit from zero or
 more other tables, and a query can reference either all rows of a table or
 all rows of a table plus all of its descendants.
- 
-
- 
-  The inheritance hierarchy is actually a directed acyclic graph.
- 
-

  

--- 1091,1102 
  
 In this case, a row of capitals inherits all
 attributes (name, population, and altitude) from its parent, cities.  State
!capitals have an extra attribute, state, that shows their state.  
!   
!   
!In PostgreSQL, a table can inherit from zero or
 more other tables, and a query can reference either all rows of a table or
 all rows of a table plus all of its descendants.

  

***
*** 1133,1163 

  

!   
!Deprecated 
!
!  In previous versions of PostgreSQL, the
!  default behavior was not to include child tables in queries. This was
!  found to be error prone and is also in violation of the SQL:2003
!  standard. Under the old syntax, to get the sub-tables you append
!  * to the table name.
!  For example
! 
! SELECT * from cities*;
! 
!  You can still explicitly specify scanning child tables by appending
!  *, as well as explicitly specify not scanning child tables by
!  writing ONLY.  But beginning in version 7.1, the default
!  behavior for an undecorated table name is to scan its child tables
!  too, whereas before the default was not to do so.  To get the old
!  default behavior, set the configuration option
!  SQL_Inheritance to off, e.g.,
! 
! SET SQL_Inheritance TO OFF;
! 
!  or add a line in your postgresql.conf file.
!
!   
  

In some cases you may wish to know which table a particular row
--- 1158,1

Re: [PATCHES] Partitioning docs

2005-10-31 Thread Neil Conway
On Mon, 2005-31-10 at 22:41 +, Simon Riggs wrote:
> I believe this is now complete and ready for application.

The changes need a fair bit of copy editing and SGML policy work, but
that is probably easier to do once it has been applied. Barring any
objections I'll apply the patch within 24 hours.

-Neil



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Partitioning docs

2005-10-31 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> On Mon, 2005-31-10 at 22:41 +, Simon Riggs wrote:
>> I believe this is now complete and ready for application.

> The changes need a fair bit of copy editing and SGML policy work, but
> that is probably easier to do once it has been applied. Barring any
> objections I'll apply the patch within 24 hours.

I'd argue for editing first and then applying.  I'll take up the job
if you don't have time for the editing part... I'm hoping to spend most
of this week on docs editing anyway, since anything else will raise
Marc's hackles about whether we need another RC ;-)

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Partitioning docs

2005-10-31 Thread Neil Conway
On Mon, 2005-31-10 at 23:15 -0500, Tom Lane wrote:
> I'd argue for editing first and then applying.  I'll take up the job
> if you don't have time for the editing part 

Okay. I'll do a round of copy editing and then commit to CVS -- there
will likely be room for additional improvements, so once it's in CVS
anyone else who's interested can have at it.

-Neil



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Partitioning docs

2005-10-31 Thread Simon Riggs
On Mon, 2005-10-31 at 23:27 -0500, Neil Conway wrote:
> On Mon, 2005-31-10 at 23:15 -0500, Tom Lane wrote:
> > I'd argue for editing first and then applying.  I'll take up the job
> > if you don't have time for the editing part 
> 
> Okay. I'll do a round of copy editing and then commit to CVS -- there
> will likely be room for additional improvements, so once it's in CVS
> anyone else who's interested can have at it.

Thanks guys.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Partitioning docs

2005-11-01 Thread Neil Conway
On Mon, 2005-31-10 at 22:41 +, Simon Riggs wrote: 
> I believe this is now complete and ready for application.

Comments:

- INSERT, UPDATE, etc. should be marked with , unless 
would be more appropriate

- The names of GUC variables should be marked up with , unless
 would be more appropriate

-  tags that link to the reference page of an SQL command should
be of the form:  -- the
endterm attribute should not be omitted.

- "PostgreSQL" should be marked-up with 

- In text like "You can use RULEs to ...", "rules" would be better.

- The word following a colon should not be capitalized

- "—" is an em dash, "--" and "---" are not

- "indexes", not "indices"

- Why "Constraint Exclusion" (or worse, "the Constraint Exclusion
feature") rather than simply "constraint exclusion"? (I'm not even sure
it's a good idea to mention this term in end-user documentation.)

- I removed a few statements and paragraphs I thought were unnecessary
(e.g. Postgres was the first DBMS to have inheritance, some vague and
IMHO useless advice about query optimization differences with inherited
tables, etc.). Feel free to resubmit them if you disagree (although
perhaps not for 8.1.0).

+ All constraints on all partitions of the master table are considered
for
+ Constraint Exclusion, so large numbers of partitions are likely to 
+ increase query parse time considerably.

Wouldn't it primarily increase planning time, not parsing time?

+ 
+  CE only works when the query directly matches a constant. A
+  constant bound to a parameterised query will not work in the same way
+  since the plan is fixed and would need to vary with each execution.
+  Also, stable constants such as CURRENT_DATE may not be used, since
+  these are constant only for during the execution of a single query.
+  Joins conditions will not allow CE to work either.
+ 

I'm not sure what the last sentence is intended to mean.

Revised patch attached and applied. There are at least a few more things
that need cleaning up -- if no one beats me to it I'll do that shortly.

-Neil

Index: doc/src/sgml/config.sgml
===
RCS file: /home/neilc/postgres/cvs_root/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.33
diff -c -r1.33 config.sgml
*** doc/src/sgml/config.sgml	26 Oct 2005 12:55:07 -	1.33
--- doc/src/sgml/config.sgml	1 Nov 2005 22:54:16 -
***
*** 1974,1984 
 
  
 
! When this parameter is on, the planner compares query
! conditions with table CHECK constraints, and omits scanning tables
! where the conditions contradict the constraints.  (Presently
! this is done only for child tables of inheritance scans.)  For
! example:
  
  
  CREATE TABLE parent(key integer, ...);
--- 1974,1984 
 
  
 
! When this parameter is on, the planner compares
! query conditions with table CHECK constraints, and
! omits scanning tables where the conditions contradict the
! constraints.  (Presently this is done only for child tables of
! inheritance scans.)  For example:
  
  
  CREATE TABLE parent(key integer, ...);
***
*** 1988,2010 
  SELECT * FROM parent WHERE key = 2400;
  
  
! With constraint exclusion enabled, this SELECT will not scan
! child1000 at all.  This can improve performance when
! inheritance is used to build partitioned tables.
 
  
 
! Currently, constraint_exclusion defaults to
! off, because it risks incorrect results if
! query plans are cached --- if a table constraint is changed or dropped,
! the previously generated plan might now be wrong, and there is no
! built-in mechanism to force re-planning.  (This deficiency will
! probably be addressed in a future
! PostgreSQL release.)  Another reason
! for keeping it off is that the constraint checks are relatively
  expensive, and in many circumstances will yield no savings.
! It is recommended to turn this on only if you are actually using
! partitioned tables designed to take advantage of the feature.
 

   
--- 1988,2017 
  SELECT * FROM parent WHERE key = 2400;
  
  
! With constraint exclusion enabled, this SELECT
! will not scan child1000 at all.  This can
! improve performance when inheritance is used to build
! partitioned tables.
 
  
 
! Currently, constraint_exclusion is disabled by
! default because it risks incorrect results if query plans are
! cached — if a table constraint is changed or dropped,
! the previously generated plan might now be wrong, and there is
! no built-in mechanism to force re-planning.  (This deficiency
! will probably be addressed in a future
! PostgreSQL release.)  Another reason for
! keeping it off 

Re: [PATCHES] Partitioning docs

2005-11-02 Thread Simon Riggs
On Tue, 2005-11-01 at 18:19 -0500, Neil Conway wrote:
> On Mon, 2005-31-10 at 22:41 +, Simon Riggs wrote: 
> > I believe this is now complete and ready for application.
> 
> Comments:
> 
> - INSERT, UPDATE, etc. should be marked with , unless 
> would be more appropriate
> 
> - The names of GUC variables should be marked up with , unless
>  would be more appropriate
> 
> -  tags that link to the reference page of an SQL command should
> be of the form:  -- the
> endterm attribute should not be omitted.
> 
> - "PostgreSQL" should be marked-up with 
> 
> - In text like "You can use RULEs to ...", "rules" would be better.
> 
> - The word following a colon should not be capitalized
> 
> - "—" is an em dash, "--" and "---" are not
> 
> - "indexes", not "indices"

Thanks very much for a thorough review.

> - Why "Constraint Exclusion" (or worse, "the Constraint Exclusion
> feature") rather than simply "constraint exclusion"? 

OK

> (I'm not even sure
> it's a good idea to mention this term in end-user documentation.)

We now have a parameter called constraint_exclusion, so the term already
exists and so requires explanation. I would have had no objection to
modifications of that term, but it has been in use now for 4 months, so
changing it doesn't seem practical.

> - I removed a few statements and paragraphs I thought were unnecessary
> (e.g. Postgres was the first DBMS to have inheritance, some vague and
> IMHO useless advice about query optimization differences with inherited
> tables, etc.). Feel free to resubmit them if you disagree (although
> perhaps not for 8.1.0).

Trying to identify which bit of advice you refer to I put some
comments in based upon feedback from the beta on specific queries that
were not optimised the same as non-inherited tables. If thats what
you're talking about, then I'd like to put that back. The manuals aren't
written for you and me; why let others stumble when they could have it
in black and white?

> + All constraints on all partitions of the master table are considered
> for
> + Constraint Exclusion, so large numbers of partitions are likely to 
> + increase query parse time considerably.
> 
> Wouldn't it primarily increase planning time, not parsing time?

Yes. What generic term would you use for query compilation? query
preparation? The distinction of parsing/planning/optimization etc is
lost on most people.

> + 
> +  CE only works when the query directly matches a constant. A
> +  constant bound to a parameterised query will not work in the same way
> +  since the plan is fixed and would need to vary with each execution.
> +  Also, stable constants such as CURRENT_DATE may not be used, since
> +  these are constant only for during the execution of a single query.
> +  Joins conditions will not allow CE to work either.
> + 
> 
> I'm not sure what the last sentence is intended to mean.

OK, I'll work on a longer explanation of that.

> Revised patch attached and applied. There are at least a few more things
> that need cleaning up -- if no one beats me to it I'll do that shortly.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Partitioning docs

2005-11-02 Thread Neil Conway
On Wed, 2005-02-11 at 19:55 +, Simon Riggs wrote:
> Trying to identify which bit of advice you refer to I put some
> comments in based upon feedback from the beta on specific queries that
> were not optimised the same as non-inherited tables.

ISTM that query optimization *always* works differently for inherited
versus non-inherited tables, so there are a wide variety of queries you
could describe like that.

The other problem is the documentation is sufficiently vague that it is
of little use, IMHO. Simply saying "query X is optimized differently"
without explaining what causes the difference, what the performance
impact is likely to be, or how to workaround the problem isn't likely to
be very helpful.

-Neil



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org