Re: [HACKERS] adminpack and pg_catalog

2006-11-06 Thread Dave Page

Neil Conway wrote:

On Fri, 2006-10-20 at 22:59 +0200, Peter Eisentraut wrote:
Nothing except initdb should add objects in pg_catalog.  AFAICS, 
adminpack doesn't have any special requirements, so it should behave 
like all other contrib modules.


Where are we on this? When this topic was last discussed, the three
alternatives were:


(1) Modify contrib/adminpack to not use the pg_catalog schema,
per the consensus that contrib/ packages installing objects
into that schema is broken behavior

(2) Don't modify contrib/adminpack, for the sake of backward
compatibility

(3) Remove contrib/adminpack from the Postgres distribution

I think the discussion was edging toward #3, but #2 is the only option
that I'm not happy with. Any other opinions out there?


Looking back over the thread, it appears that only you and Peter 
objected to it as it is now. Tom, Andreas and I were of the opinion 
that it was fine as it is, and whilst he didn't comment on how it should 
be implemented, Simon made the point that supporting admin tools from 
the core distribution was important, which I take to mean he is against #3.


Regards, Dave.



Re: [HACKERS] [PATCHES] Bug in WAL backup documentation

2006-11-06 Thread Simon Riggs
On Sun, 2006-11-05 at 15:02 +, Simon Riggs wrote:

> Code comments now discuss relative paths also.

Patch containing just the minor cleanup of docs and code comments.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com

Index: doc/src/sgml/backup.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.93
diff -c -r2.93 backup.sgml
*** doc/src/sgml/backup.sgml	4 Nov 2006 18:20:27 -	2.93
--- doc/src/sgml/backup.sgml	6 Nov 2006 08:21:22 -
***
*** 599,605 
  In writing your archive command, you should assume that the file names to
  be archived may be up to 64 characters long and may contain any
  combination of ASCII letters, digits, and dots.  It is not necessary to
! remember the original full path (%p) but it is necessary to
  remember the file name (%f).
 
  
--- 599,605 
  In writing your archive command, you should assume that the file names to
  be archived may be up to 64 characters long and may contain any
  combination of ASCII letters, digits, and dots.  It is not necessary to
! remember the original relative path (%p) but it is necessary to
  remember the file name (%f).
 
  
Index: src/backend/access/transam/xlog.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.252
diff -c -r1.252 xlog.c
*** src/backend/access/transam/xlog.c	18 Oct 2006 22:44:11 -	1.252
--- src/backend/access/transam/xlog.c	6 Nov 2006 08:21:31 -
***
*** 2417,2423 
  			switch (sp[1])
  			{
  case 'p':
! 	/* %p: full path of target file */
  	sp++;
  	StrNCpy(dp, xlogpath, endp - dp);
  	make_native_path(dp);
--- 2417,2423 
  			switch (sp[1])
  			{
  case 'p':
! 	/* %p: relative path of target file */
  	sp++;
  	StrNCpy(dp, xlogpath, endp - dp);
  	make_native_path(dp);
Index: src/backend/postmaster/pgarch.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgarch.c,v
retrieving revision 1.25
diff -c -r1.25 pgarch.c
*** src/backend/postmaster/pgarch.c	7 Aug 2006 17:41:42 -	1.25
--- src/backend/postmaster/pgarch.c	6 Nov 2006 08:21:33 -
***
*** 417,423 
  			switch (sp[1])
  			{
  case 'p':
! 	/* %p: full path of source file */
  	sp++;
  	StrNCpy(dp, pathname, endp - dp);
  	make_native_path(dp);
--- 417,423 
  			switch (sp[1])
  			{
  case 'p':
! 	/* %p: relative path of source file */
  	sp++;
  	StrNCpy(dp, pathname, endp - dp);
  	make_native_path(dp);
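
For readers following the %p/%f distinction: %p now expands to the path of
the segment relative to the data directory, so an archive_command only has
to preserve %f at the destination. A minimal postgresql.conf sketch (the
archive directory below is illustrative, not part of the patch):

    # %p = relative path of the WAL segment to archive
    # %f = file name only; this is what the archive must remember
    archive_command = 'cp %p /mnt/server/archivedir/%f'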



Re: [HACKERS] WITH SYSID dropped

2006-11-06 Thread Dr. Ernst Molitor
Dear Stefan Kaltenbrunner, 

dear PostgreSQL gurus, 

current_user returns a "name" (currently, 64 bytes of data); I felt it would 
be contrary to good database practice to repeat (up to) 64 bytes of data in 
every row of every table, instead of doing what relational databases are 
about and storing just enough data (e.g., the usesysid) to make the 
relation clear. 

My assumption was that the PostgreSQL extension to the standard language 
(the "WITH SYSID" option of the CREATE USER command, which in itself is still 
included in the grammar to allow reading dumps created with pre-8.2.x 
versions of PostgreSQL) was here to stay. 

I own up to my wrong decision - it's my fault alone - but I still hope for a 
hint on how to minimize its consequences for my applications (screenshots of 
one of them are at http://mibi03.meb.uni-bonn.de/demo/en/ ). Maybe I have no 
better option than to store the old database's usesysids and the 
corresponding user names in a table before migrating to 8.2.x, and live with 
a break (or, for future updates, several breaks ...) in the 
usename-usesysid relation.
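
A minimal sketch of that mapping-table idea (all names hypothetical),
populated from current_user as Stefan suggests:

    -- hypothetical mapping: one compact integer id per user name
    CREATE TABLE app_users (
        id      serial PRIMARY KEY,
        usename name   NOT NULL UNIQUE
    );

    -- audited rows then reference app_users.id instead of repeating the
    -- 64-byte name; a login is registered once with:
    INSERT INTO app_users (usename)
        SELECT current_user
        WHERE NOT EXISTS
            (SELECT 1 FROM app_users WHERE usename = current_user);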

Best wishes and regards, 

Ernst


On Sunday, November 5th, 2006, at 14:52, Stefan Kaltenbrunner wrote:
> Dr. Ernst Molitor wrote:
> > Dear PostgreSQL gurus,
> >
> >
> > The procedure still works, but due to the dropping of "WITH SYSID",
> > moving the data to a new PostgreSQL installation causes headaches. After
> > all, the access to the information about who has changed what is
> > expected to stay.
>
> maybe I'm missing something simple - but why are you not using
> current_user/session_user
> (http://www.postgresql.org/docs/current/interactive/functions-info.html)
> for that task - that seems much less error-prone than using an arbitrary
> number to reference a given database role ...



Re: [HACKERS] Indicate disabled triggers in \d

2006-11-06 Thread Andrew Dunstan

Tom Lane wrote:

"Brendan Jurd" <[EMAIL PROTECTED]> writes:
  

My first impulse was to just append a " (disabled)" after each
disabled trigger, but perhaps that is not visually obvious enough,
especially if the table has many triggers on it.



Agreed, but maybe put it up at the front?

Triggers:
y AFTER DELETE ON x FOR EACH ROW EXECUTE PROCEDURE do_something()
z (disabled) BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE input_stuff()

+1. This bit me recently. Or maybe, in the interests of preserving 
screen space, a [*] for enabled and [x] for disabled, or something similar.


cheers

andrew




Re: [HACKERS] Is python 2.5 supported?

2006-11-06 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Jim C. Nasby wrote:
>> Since installing python 2.5, tapir has been failing:

> I have removed the use of the deprecated whrandom module, which should 
> take care of one regression test failure, but after that I get

> *** glibc detected *** free(): invalid pointer: 0xa5df6e78 ***
> LOG:  server process (PID 1720) was terminated by signal 6

A bit of study of the Python documentation suggests that plpython.c is
taking a whole lot of shortcuts in setting up its Python type objects.
For me, this crash goes away with the attached patch ... but I don't
know if it might fail with old Python versions.

There are still a number of regression test diffs, which all look like
Python 2.5 has unilaterally changed the format of object descriptions, eg
! DETAIL:  plpy.SPIError: Unknown error in PLy_spi_prepare
vs
! DETAIL:  : Unknown error in PLy_spi_prepare
Probably there's no help for that except to have two expected files.

regards, tom lane

Index: plpython.c
===
RCS file: /cvsroot/pgsql/src/pl/plpython/plpython.c,v
retrieving revision 1.89
diff -c -r1.89 plpython.c
*** plpython.c  4 Oct 2006 00:30:14 -   1.89
--- plpython.c  6 Nov 2006 15:45:58 -
***
*** 1981,1987 
	0,  /* tp_getattro */
	0,  /* tp_setattro */
	0,  /* tp_as_buffer */
!   0,  /* tp_xxx4 */
	PLy_plan_doc,   /* tp_doc */
  };
  
--- 1981,1987 
	0,  /* tp_getattro */
	0,  /* tp_setattro */
	0,  /* tp_as_buffer */
!   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,   /* tp_flags */
	PLy_plan_doc,   /* tp_doc */
  };
  
***
*** 2026,2032 
	0,  /* tp_getattro */
	0,  /* tp_setattro */
	0,  /* tp_as_buffer */
!   0,  /* tp_xxx4 */
	PLy_result_doc, /* tp_doc */
  };
  
--- 2026,2032 
	0,  /* tp_getattro */
	0,  /* tp_setattro */
	0,  /* tp_as_buffer */
!   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,   /* tp_flags */
	PLy_result_doc, /* tp_doc */
  };
  
***
*** 2098,2104 
PLy_free(ob->args);
}
  
!   PyMem_DEL(arg);
  }
  
  
--- 2098,2104 
PLy_free(ob->args);
}
  
!   arg->ob_type->tp_free(arg);
  }
  
  
***
*** 2152,2158 
Py_XDECREF(ob->rows);
Py_XDECREF(ob->status);
  
!   PyMem_DEL(ob);
  }
  
  static PyObject *
--- 2152,2158 
Py_XDECREF(ob->rows);
Py_XDECREF(ob->status);
  
!   arg->ob_type->tp_free(arg);
  }
  
  static PyObject *
***
*** 2701,2707 
/*
 * initialize plpy module
 */
!   PLy_PlanType.ob_type = PLy_ResultType.ob_type = &PyType_Type;
plpy = Py_InitModule("plpy", PLy_methods);
plpy_dict = PyModule_GetDict(plpy);
  
--- 2701,2711 
/*
 * initialize plpy module
 */
!   if (PyType_Ready(&PLy_PlanType) < 0)
!   elog(ERROR, "could not init PLy_PlanType");
!   if (PyType_Ready(&PLy_ResultType) < 0)
!   elog(ERROR, "could not init PLy_ResultType");
! 
plpy = Py_InitModule("plpy", PLy_methods);
plpy_dict = PyModule_GetDict(plpy);
  



Re: [HACKERS] adminpack and pg_catalog

2006-11-06 Thread Simon Riggs
On Mon, 2006-11-06 at 09:02 +, Dave Page wrote:
> Neil Conway wrote:
> > On Fri, 2006-10-20 at 22:59 +0200, Peter Eisentraut wrote:
> >> Nothing except initdb should add objects in pg_catalog.  AFAICS, 
> >> adminpack doesn't have any special requirements, so it should behave 
> >> like all other contrib modules.
> > 
> > Where are we on this? When this topic was last discussed, the three
> > alternatives were:
> > (1) Modify contrib/adminpack to not use the pg_catalog schema,
> > per the consensus that contrib/ packages installing objects
> > into that schema is broken behavior
> > 
> > (2) Don't modify contrib/adminpack, for the sake of backward
> > compatibility
> > 
> > (3) Remove contrib/adminpack from the Postgres distribution
> > 
> > I think the discussion was edging toward #3, but #2 is the only option
> > that I'm not happy with. Any other opinions out there?
> 
> Looking back over the thread, it appears that only you and Peter 
> objected to it as it is now. Tom, Andreas and myself were of the opinion 
> it was fine as it is, and whilst he didn't comment on how it should be 
> implemented, Simon made the point that supporting admin tools from the 
> core distribution was important which I take to mean he is against #3.

Definitely against #3. [Argument: not just pgAdmin, essential feature]

At the moment we only allow two types of table: approved core catalog
tables and user tables.

ISTM we need 3 types of tables, with the additional type being add-on
system functionality, such as adminpack, but also possibly including
performance logging tables, extended trace tools, security packages
(e.g. Veil) etc etc. We have made provision for user add-ons to allocate
LWlocks and shared memory, but no provision for additional tables.
Surely we must be able to accommodate things as simple as additional
tables?

So we would then have (change names as appropriate)
- core catalog
- catalog extensions
- user tables.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] adminpack and pg_catalog

2006-11-06 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> At the moment we only allow two types of table: approved core catalog
> tables and user tables.

> ISTM we need 3 types of tables, with the additional type being add-on
> system functionality, such as adminpack,

What?  The adminpack module only creates functions.

regards, tom lane



[HACKERS] cvs 'initdb' -- "vacuuming database template1 ... FATAL: could not identify a comparison function for type aclitem"

2006-11-06 Thread Matt Miller
head does this to me when I try to initdb:

[EMAIL PROTECTED]:~$ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale en_US.UTF-8.
The default database encoding has accordingly been set to UTF8.

fixing permissions on existing directory /usr/local/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers/max_fsm_pages ... 32MB/204800
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating conversions ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
vacuuming database template1 ... FATAL:  could not identify a comparison 
function for type aclitem
child process exited with exit code 1
initdb: removing contents of data directory "/usr/local/pgsql/data"



Re: [HACKERS] cvs 'initdb' -- "vacuuming database template1 ... FATAL: could not identify a comparison function for type aclitem"

2006-11-06 Thread Tom Lane
"Matt Miller" <[EMAIL PROTECTED]> writes:
> head does this to me when I try to initdb:

I bet you didn't do a full recompile after "cvs update".
If you're not using --enable-depend then you really have
to do "make clean" or even "make distclean".

regards, tom lane



Re: [HACKERS] adminpack and pg_catalog

2006-11-06 Thread Simon Riggs
On Mon, 2006-11-06 at 13:37 -0500, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > At the moment we only allow two types of table: approved core catalog
> > tables and user tables.
> 
> > ISTM we need 3 types of tables, with the additional type being add-on
> > system functionality, such as adminpack,
> 
> What?  The adminpack module only creates functions.

AFAIK the namespace is the issue, not the type of object.

As I mentioned, we can add LWlock extensions but we don't have an
official home for other database objects.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] cvs 'initdb' -- "vacuuming database template1 ... FATAL:

2006-11-06 Thread Matt Miller
> > head does this to me when I try to initdb:
> 
> I bet you didn't do a full recompile after "cvs update".
> If you're not using --enable-depend then you really have
> to do "make clean" or even "make distclean".

I am using --enable-depend, but I'll "make clean" and give
it another shot.



Re: [HACKERS] cvs 'initdb' -- "vacuuming database template1 ... FATAL:

2006-11-06 Thread Matt Miller
> > > head does this to me when I try to initdb:
> >  ...
> > do "make clean" or even "make distclean".
> 
> I am using --enable-depend, but I'll "make clean" and give
> it another shot.

All better.  Thanks.

I guess I'll be suspicious of --enable-depend for a while.





[HACKERS] Tsearch Index Size and GiST vs. GIN

2006-11-06 Thread Richard Whidden
Due to popular demand (1 person), I've compared sizes with 90 and 100 
fillfactors, along with using the new GIN index.


Findings were not surprising, except for the GIN indexes, which doubled in 
size.


After several ALTERs/REINDEXes, here they are:

GiST
----
8.1 = 94990
8.2 FF90 = 106244
8.2 FF100 = 95049

GIN
---
8.2 FF100 = 197702

(All figures are relpages, i.e. 8k pages.)
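
For reference, figures like these can be reproduced along the following
lines (index and column names are hypothetical; GIN does not accept a
fillfactor in 8.2, hence the single FF100 figure):

    CREATE INDEX docs_tsv_gist ON docs USING gist (tsv) WITH (fillfactor = 90);
    CREATE INDEX docs_tsv_gin  ON docs USING gin  (tsv);
    SELECT relname, relpages FROM pg_class
     WHERE relname IN ('docs_tsv_gist', 'docs_tsv_gin');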


Richard Whidden



[HACKERS] Introducing an advanced Frequent Update Optimization

2006-11-06 Thread Simon Riggs
EnterpriseDB has been running a research project to improve the
performance of heavily updated tables. We have a number of approaches
prototyped and we'd like to discuss the best of these now on -hackers
for community input and patch submission to PostgreSQL core.

The most important step with any proposal is to agree that we have an
issue that needs improvement, discuss how widespread that issue is and
find some clear test cases that show up the problems. Tests are:

1. pgbench reveals performance that will degrade over a long period.

2. DBT-2 reveals performance that will degrade over a long period. Many
tests over a 2 hour period don't fully show this, especially when the
test is carefully tuned.

3. A common scenario in applications is where some rows of a table
are "hot" from being constantly updated, while others are not. An
example of such a test case is the truckin' test, included here. It's
based directly on a specific customer application, but it's been
generalised to make sure the underlying design pattern is clear.

These tests reveal the following issues, all of which are known:

- update performs inserts into indexes, as well as into heap blocks

- VACUUM can clean heap blocks easily, but performs much worse on
indexes, making VACUUM a less effective solution. We have now been able to
speed up index VACUUM, but this requires us to scan the whole index for
correct locking. VACUUM scans the whole table, whereas dead rows may
well be localised. A heap-needs-vacuum bitmap has been proposed here, but
no solution currently exists for vacuuming only parts of indexes, and so
proposals for concurrent vacuums are now being considered.

- indexes that have been stretched apart by updates never coalesce
again and require regular REINDEX, which is not yet possible
concurrently; the contention caused by a non-concurrent REINDEX would be
catastrophic for performance, and nobody yet knows a way to do it
concurrently.

- There are specific issues with the optimizer's ability to understand
dead row numbers, which can in some cases lead to SeqScan plans that are
inappropriate when tables grow because of updates. This is a red herring
that can lead people to think the situation is worse than it is; it
needs fixing, but the core issues mentioned above remain.

To alleviate these problems we've added features such as WITH fillfactor
for heaps and table-level autovacuum tuning. Tuning all of these
features to good effect is an art form beyond the reach of most users.
Many internal optimizations have been made in this area and, as can be
seen, many more are still required to achieve better performance.

The proposal about to be made takes a more radical approach and
re-examines the architecture of the heap, to allow us to consider much
faster designs for heavy UPDATEs. Although initially radical, the
proposal appears to be fully MVCC-correct and crash-safe, as well as
much faster under heavy updates, while approximately neutral in other
cases, with no major downsides.

Why should we care? The UPDATE case has obvious use-cases in a business
design pattern I'll call CustomerAccountDebit which is pervasive in
pay-per-use websites, banks, telephone companies, road traffic
monitoring etc etc. It's also pervasive in Data Warehousing where
summary tables/materialized views are regularly updated to maintain a
current picture of spending, movements or any other accumulation of
event detail. It's everywhere, basically.
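
A minimal sketch of that debit pattern (hypothetical names):

    CREATE TABLE customer_account (
        custid  integer PRIMARY KEY,
        balance numeric NOT NULL
    );

    -- executed millions of times against a small set of "hot" rows;
    -- today each UPDATE leaves a dead heap tuple behind and inserts
    -- new index entries, which is exactly the cost discussed above
    UPDATE customer_account
       SET balance = balance - 0.99
     WHERE custid = 42;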

Your various viewpoints on the above are welcome, but assuming for the
moment that you agree so far, we can move towards the proposal...

These discussions will likely be lengthy if taken seriously, and they need
to cover a range of topics so that we both cover what we know and listen
to all the feedback everybody gives. To that end, I'd like to introduce
two colleagues of mine to the community, Pavan Deolasee and Nikhil
Sontakke, who have been working hard on developing the prototypes and on
measuring/tuning them, respectively. 

I would stress that we are not bringing our first prototype to the
table, but actually design #5. We think you'll be interested, but we
won't take that for granted.

Our next steps will be to

- discuss various other approaches to the problem, and why we are now
proposing one specific approach and receive "why dont we..." feedback
and additional ideas (Simon)

- discuss the proposal in technical depth, explain the challenges that
remain and ask for feedback and input on those, with specific regard to
low-level coding (Pavan)

- present details of performance testing done so far (Nikhil)

- explain the measures we have taken to prove the correctness of our
approach for MVCC, crash safety and PITR (Simon)

Each of these areas will be started as a separate thread of discussion
on -hackers, to allow us to stay focused on those topics.

But before we do that, any comments on the above?

---

The truckin test case included here consists of a complex update
function that is executed

Re: [HACKERS] Index ignored with "is not distinct from", 8.2 beta2

2006-11-06 Thread Jim C. Nasby
One issue is that I'm not sure you've got your sugar quite right.
Have you tested with:

(col IS NOT NULL AND 123 IS NOT NULL AND col = 123) OR
(col IS NULL and 123 IS NULL) ?

It's possible that the planner doesn't know about using an index for
DISTINCT; or it might just want an index that's defined WHERE col IS NOT
NULL.
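
For the second idea, a hypothetical partial index matching the IS NOT
NULL arm of the rewrite would be:

    CREATE UNIQUE INDEX i_notnull ON t (col) WHERE col IS NOT NULL;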

On Wed, Nov 01, 2006 at 03:29:33PM -0500, JEAN-PIERRE PELLETIER wrote:
> I've reposted this from pgsql-performance where I got no response.
> 
> ==
> 
> Hi,
> 
> I wanted to use "exp1 is not distinct from exp2" which I thought was 
> syntactic sugar for
> exp1 is not null and exp2 is not null and exp1 = exp2 or exp1 is null and 
> exp2 is null
> but my index is ignored with "is not distinct from".
> 
> Is this the expected behavior ?
> 
> create temporary table t as select * from generate_series(1,100) t(col);
> create unique index i on t(col);
> analyze t;
> 
> -- These queries don't use the index
> select count(*) from t where col is not distinct from 123;
> select count(*) from t where not col is distinct from 123;
> 
> -- This query uses the index
> select count(*) from t where col is not null and 123 is not null and col = 
> 123 or col is null and 123 is null;
> 
> explain analyze select count(*) from t where col is not distinct from 123;
> QUERY PLAN
> 
> Aggregate  (cost=19154.79..19154.80 rows=1 width=0) (actual 
> time=228.200..228.202 rows=1 loops=1)
>   ->  Seq Scan on t  (cost=0.00..17904.90 rows=499956 width=0) (actual 
> time=0.042..228.133 rows=1 loops=1)
> Filter: (NOT (col IS DISTINCT FROM 123))
> Total runtime: 228.290 ms
> (4 rows)
> Time: 219.000 ms
> 
> explain analyze select count(*) from t where not col is distinct from 123;
> QUERY PLAN
> 
> Aggregate  (cost=19154.79..19154.80 rows=1 width=0) (actual 
> time=235.950..235.952 rows=1 loops=1)
>   ->  Seq Scan on t  (cost=0.00..17904.90 rows=499956 width=0) (actual 
> time=0.040..235.909 rows=1 loops=1)
> Filter: (NOT (col IS DISTINCT FROM 123))
> Total runtime: 236.065 ms
> (4 rows)
> Time: 250.000 ms
> 
> explain analyze select count(*) from t where col is not null and 123 is not 
> null and col = 123 or col is null and 123 is null;
>QUERY PLAN
> ---
> Aggregate  (cost=8.13..8.14 rows=1 width=0) (actual time=0.267..0.268 
> rows=1 loops=1)
>   ->  Index Scan using i on t  (cost=0.00..8.13 rows=1 width=0) (actual 
> time=0.237..0.241 rows=1 loops=1)
> Index Cond: (col = 123)
> Total runtime: 0.366 ms
> (4 rows)
> Time: 0.000 ms
> 
> I am on Windows XP Service pack 2 with PostgreSQL 8.2 beta2
> 
> Thanks,
> Jean-Pierre Pelletier
> e-djuster
> 
> 
> 

-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



[HACKERS] Direct Client Req from ProV-----PostGre Database Developer at San Diego, CA

2006-11-06 Thread Andrew




Hi,

This is Andrew with ProV International.

I have a direct client requirement for the position of PostGre Database
Developer at San Diego, CA. For your reference, given below is the
requirement. If you find this interesting and matching to your skills then
please send across your updated resume in Word format ASAP. If you have
any query then please let me know.

Position Title: PostGre Database Developer
Location: San Diego, CA
Duration: 6 Months

Responsibilities: This position involves creating tables, views, functions
and stored procedures to support front end OLTP and reporting applications.
The ideal developer will have thorough knowledge of SQL (PL/pgSQL),
experience with at least one other PostgreSQL language (e.g. PL/Perl), and
extensive experience with complex stored procedures, code optimization, and
index tuning in PostgreSQL.

Skills Required:
- 5+ years database development with PostgreSQL
- Knowledge of at least one other language in addition to PL/pgSQL, such as
  PL/Perl or PL/Java
- Experience implementing PostgreSQL replication using Slony-I
- Some experience with either SQL Server 2000 or Oracle 9i/10g
- Significant background in creating complex stored procedures and SQL scripts
- Understanding of database normalization concepts
- Some experience in logical and physical database design and implementation
- Prior experience working in a project oriented environment and meeting
  deadlines under tight time constraints

Thanks and Regards,
Andrew
ProV International
Ph: 408-241-7795 ext 40
[EMAIL PROTECTED]
www.provintl.com


Re: [HACKERS] Indicate disabled triggers in \d

2006-11-06 Thread David Fetter
On Mon, Nov 06, 2006 at 09:12:32AM -0500, Andrew Dunstan wrote:
> Tom Lane wrote:
> >"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> >  
> >>My first impulse was to just append a " (disabled)" after each
> >>disabled trigger, but perhaps that is not visually obvious enough,
> >>especially if the table has many triggers on it.
> >>
> >
> >Agreed, but maybe put it up at the front?
> >
> >Triggers:
> >y AFTER DELETE ON x FOR EACH ROW EXECUTE PROCEDURE do_something()
> >z (disabled) BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE 
> >input_stuff()
> 
> +1. This bit me recently. Or maybe, in the interests of preserving
> screen space, a [*] for enabled and [x] for disabled, or something
> similar.

For this case, I think clarity is more important than saving screen
space.

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778   AIM: dfetter666
  Skype: davidfetter

Remember to vote!



Re: [HACKERS] Introducing an advanced Frequent Update

2006-11-06 Thread Luke Lonergan
Simon,

Bring it on!  We at GP have been evaluating various approaches to
index-organized tables, which unify index and heap storage, to solve some
of the problems you mention.  Split index and heap is a big issue in
Postgres and we'd all welcome a good solution to it, even for limited
circumstances like single-index organization or the like.

- Luke 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Simon Riggs
> Sent: Monday, November 06, 2006 1:51 PM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Introducing an advanced Frequent Update 
> Optimization
> 
> EnterpriseDB has been running a research project to improve 
> the performance of heavily updated tables. We have a number 
> of approaches prototyped and we'd like to discuss the best of 
> these now on -hackers for community input and patch 
> submission to PostgreSQL core.
> 
> [...]

Re: [HACKERS] Direct Client Req from ProV-----

2006-11-06 Thread Josh Berkus
Andrew,

> I have a direct client requirement for the position of PostGre Database
> Developer at San Diego, CA. For your reference, given below is the
> requirement. If you find this interesting and matching to your skills then
> please send across your updated resume in Word format ASAP. If you have
> any query then please let me know.

Please post this to [EMAIL PROTECTED], NOT here.  Thanks.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] NULL in arrays

2006-11-06 Thread Jim C. Nasby
On Sun, Nov 05, 2006 at 09:53:08PM +0100, Martijn van Oosterhout wrote:
> Note that the constructs:
> 
> ARRAY['a',"NULL",'c']
> 
> and
> 
> '{a,"NULL",c}'
> 
> are *completely* different. The first is a special array constructor
> and all its parameters are normal SQL expressions, so you can reference
> columns and use NULL directly without quotes. The latter is the string
> value of the array, which is specially decoded. That's why the latter
> treats the double quotes differently.

This could probably be expanded on in the docs... mentioning the
difference in the NULLs section would be a good start, i.e.:

To set an element of an array constant to NULL, write NULL for the
element value. (Any upper- or lower-case variant of NULL will do.) If
you want an actual string value "NULL", you must put double quotes
around it. Note that if you use the ARRAY construct you should just use
a bareword NULL instead.
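
A quick illustration of the difference (assuming 8.2's array input
behavior, where an unquoted NULL in an array literal is a null element):

    SELECT (ARRAY['a', NULL, 'c'])[2] IS NULL;  -- true: element is SQL NULL
    SELECT ('{a,NULL,c}'::text[])[2] IS NULL;   -- true: unquoted NULL in literal
    SELECT ('{a,"NULL",c}'::text[])[2];         -- the four-character string NULL
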
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] Introducing an advanced Frequent Update Optimization

2006-11-06 Thread Mark Kirkwood

Simon Riggs wrote:

EnterpriseDB has been running a research project to improve the
performance of heavily updated tables. We have a number of approaches
prototyped and we'd like to discuss the best of these now on -hackers
for community input and patch submission to PostgreSQL core.



Excellent! It would certainly be good for use cases like:

- session data from web (or similar) applications
- real time summary tables maintained by triggers

to "just work", as (certainly in the case of the first one) quite a few 
folks have been bitten by exactly the issue you describe.


best wishes

Mark



Re: [HACKERS] Index ignored with "is not distinct from", 8.2 beta2

2006-11-06 Thread JEAN-PIERRE PELLETIER

This shows all three forms to be equivalent.

SELECT
  exp1,
  exp2,
  exp1 IS NOT DISTINCT FROM exp2 AS isnotdistinct,
  exp1 is not null and exp2 is not null and exp1 = exp2 or exp1 is null and 
exp2 is null AS JP,
  (exp1 is not null and exp2 is not null and exp1 = exp2) or (exp1 is null 
and exp2 is null) AS Jim

FROM
  (SELECT 1 AS exp1, 1 AS exp2
  UNION ALL SELECT 1, 2
  UNION ALL SELECT 1,NULL
  UNION ALL SELECT NULL,1
  UNION ALL SELECT NULL,NULL) Q;

I understand that the planner doesn't use indexes for IS NOT DISTINCT
FROM, but it would be good if it did, because "is not distinct from" is
very useful when you have nulls but don't want to use three-valued logic.


null = null => true
null = non-null => false

I don't think that is so uncommon; even some SQL constructs such as 
"select distinct" or "group by" compare nulls that way.


I'll wait before using IS NOT DISTINCT FROM and stick with the equivalent 
longer forms, which use indexes.

8.2 is better than 8.1, which was not picking up indexes even with the 
longer forms.


Jean-Pierre Pelletier


From: "Jim C. Nasby" <[EMAIL PROTECTED]>
To: JEAN-PIERRE PELLETIER <[EMAIL PROTECTED]>
CC: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Index ignored with "is not distinct from", 8.2 beta2
Date: Mon, 6 Nov 2006 16:02:40 -0600

One issue is that I'm not sure think you've got your sugar quite right.
Have you tested with:

(col IS NOT NULL AND 123 IS NOT NULL AND col = 123) OR
(col IS NULL and 123 IS NULL) ?

It's possible that the planner doesn't know about using an index for
DISTINCT; or it might just want an index that's defined WHERE col IS NOT
NULL.

On Wed, Nov 01, 2006 at 03:29:33PM -0500, JEAN-PIERRE PELLETIER wrote:
> [...]

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)






Re: [HACKERS] Introducing an advanced Frequent Update Optimization

2006-11-06 Thread ITAGAKI Takahiro
"Simon Riggs" <[EMAIL PROTECTED]> wrote:

> EnterpriseDB has been running a research project to improve the
> performance of heavily updated tables. We have a number of approaches
> prototyped and we'd like to discuss the best of these now on -hackers
> for community input and patch submission to PostgreSQL core.

I'm very interested in your proposal! NTT is also working on OLTP workloads,
especially on improvements to VACUUM. Maybe we have similar problems.


> - VACUUM can remove heap blocks easily, but performs much worse on
> indexes, making VACUUM a less good solution. We have now been able to
> speed up index VACUUM, but this require us to scan the whole index for
> correct locking. VACUUM scans the whole table, whereas dead rows may
> well be localised. Heap-needs-vacuum-bitmap has been proposed here, but
> no solution currently exists for vacuuming only parts of indexes and so
> proposals for concurrent vacuums are now being considered.
> 
> The proposal about to be made takes a more radical approach and
> re-examines the architecture of the heap, to allow us to consider much
> faster designs for heavy UPDATEs. Although initially radical, the
> proposal appears to be fully MVCC correct, crash safe as well as being
> much faster under heavy updates, while approximately neutral in other
> cases with no major downsides.

I made prototypes of heap-needs-vacuum-bitmap and per-entry index deletion.
The test results show that they save vacuuming time. I'm refining them and
making them robust now.

We can make use of the present structures with that approach, so I had
thought it was a relatively good direction. However, you seem to propose
a whole new storage engine or on-disk structure. Do you have reasons to
think that the various extending-VACUUM approaches are not enough?
It would be very nice if you could give us some more background.

> - discuss various other approaches to the problem, and why we are now
> proposing one specific approach and receive "why dont we..." feedback
> and additional ideas (Simon)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] Indicate disabled triggers in \d

2006-11-06 Thread Brendan Jurd

As discussed briefly on pgsql-hackers, the current psql \d command
does not make any distinction between enabled and disabled triggers.

The attached patch modifies psql's describeOneTableDetails() such that
triggers and disabled triggers are displayed as two separate footer
lists, for example:

Triggers:
  y AFTER DELETE ON x FOR EACH ROW EXECUTE PROCEDURE do_something()
Disabled triggers:
  z BEFORE INSERT ON x FOR EACH ROW EXECUTE PROCEDURE input_stuff()

The patch compiled and tested cleanly on my machine, and passed all
regression tests.
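
To reproduce the new output, a disabled trigger can be set up roughly like
this (hypothetical names; plpgsql must be installed):

    CREATE TABLE x (v integer);
    CREATE FUNCTION input_stuff() RETURNS trigger
        AS 'BEGIN RETURN NEW; END;' LANGUAGE plpgsql;
    CREATE TRIGGER z BEFORE INSERT ON x
        FOR EACH ROW EXECUTE PROCEDURE input_stuff();
    ALTER TABLE x DISABLE TRIGGER z;

after which \d x should list z under "Disabled triggers:".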

I didn't find any relevant documentation that needed patching, so this
feature addition should work fine as a standalone patch.

Regards,
BJ


describe.c.diff
Description: Binary data



Re: [HACKERS] Indicate disabled triggers in \d

2006-11-06 Thread Brendan Jurd

On 11/7/06, Brendan Jurd <[EMAIL PROTECTED]> wrote:

As discussed briefly on pgsql-hackers, the current psql \d command
does not make any distinction between enabled and disabled triggers.

The attached patch modifies psql's describeOneTableDetails() such that
triggers and disabled triggers are displayed as two separate footer
lists, for example:



Minor fix to the previous patch; result7 was not being cleared at the
end of the block.


describe.c.diff
Description: Binary data



Re: [HACKERS] Introducing an advanced Frequent Update Optimization

2006-11-06 Thread Simon Riggs
On Tue, 2006-11-07 at 13:02 +0900, ITAGAKI Takahiro wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> wrote:
> 
> > EnterpriseDB has been running a research project to improve the
> > performance of heavily updated tables. We have a number of approaches
> > prototyped and we'd like to discuss the best of these now on -hackers
> > for community input and patch submission to PostgreSQL core.
> 
> I'm very interested in your proposal! NTT is also working for OLTP workloads,
> especially on improvements of VACUUM. Maybe we have similar problems.

Seems very likely.

> I made a prototypes of Heap-needs-vacuum-bitmap and per-entry-index-deletion.
> The test result shows that it saves vacuuming time. I'm refining and making
> it robust now.
> 
> We can make use of the present structures with the approach, so I have
> thought it is a relatively good direction. However, you seem to propose
> a whole new storage engine or on-disk-structure. Do you have any viewpoints
> that some kinds of extending-VACUUM approach are not enough?

That's been something we have considered, with good results.

We still need to VACUUM, but in a modified way.

> It would be very nice if you could give us some more background.

Certainly. We'll be posting a full design description on Wednesday; I'm
just editing that now.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





[HACKERS] Uncleared result sets in describeOneTableDetails()

2006-11-06 Thread Brendan Jurd

While I was poking around in src/bin/psql/describe.c, I noticed that
when the query for inherited tables is run, the code checks whether
the result is valid and if not, it goes straight to the error_return,
without clearing result sets that may still be open at the time.  See
line 1174 in revision 1.147.

Contrast with other instances of result sets being opened; if it
fails, the code first clears all previously opened result sets, then
goes to error_return (e.g., line 1138).

Is it crucial that result sets be cleared before going out of scope?
If so, this looks like it needs to be patched.

Regards,
BJ



Re: [HACKERS] Uncleared result sets in describeOneTableDetails()

2006-11-06 Thread Tom Lane
"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> Is it crucial that result sets be cleared before going out of scope?

It sounds like it'd leak memory inside psql; but realistically that's
probably not an enormous problem for this usage.  How much uglification
of the code are we talking about to fix it?

regards, tom lane



Re: [HACKERS] Uncleared result sets in describeOneTableDetails()

2006-11-06 Thread Brendan Jurd

On 11/7/06, Tom Lane <[EMAIL PROTECTED]> wrote:

"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> Is it crucial that result sets be cleared before going out of scope?

It sounds like it'd leak memory inside psql; but realistically that's
probably not an enormous problem for this usage.  How much uglification
of the code are we talking about to fix it?



Should be just six extra lines (patch attached, untested).  This isn't
really an uglification of the code, so much as bringing this
particular code segment into line with the existing ugliness standard
of the rest of the function. =)

It certainly isn't pretty.  It's been a long time since I looked down
the barrel of a 'goto'.  This sort of thing feels like it should be
dealt with by RAII or try/catch, but this is C we're talking about.
It's hard to do things gracefully.

Regards,
BJ


describe.c.diff
Description: Binary data
