Re: [HACKERS] XLOG_BLCKSZ vs. wal_buffers table

2006-05-03 Thread Zeugswetter Andreas DCP SD

   I'm planning on continuing to increase XLOG_BLCKSZ and wal_buffers to
   determine when the throughput starts to level out or drop.

  I think for an even better comparison you should scale wal_buffers down
  with increasing XLOG_BLCKSZ, so that the xlog buffer has a fixed size
  in kB.

  Reasonable wal_buffers IMHO amount to at least 256 kB, better yet 512 kB
  or 1 MB, with sufficiently large transactions (and to try to factor out
  the difference between block sizes).

 AFAIK all the transactions in DBT2 are pretty small. I think all DML is
 single-row in fact, so I'm not sure that having wal_buffers much larger
 than the number of connections would help much.

Well, but those updates wander around the whole table/index, so you will
have a lot of before-images to write. So I take back the "sufficiently
large transactions" part of my comment. You want more wal_buffers in all
higher-load scenarios.

(One test had 8 buffers of 2 kB each; that is not enough in any high-load
scenario.)
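
To make that concrete, a sketch of the arithmetic (wal_buffers is counted
in XLOG_BLCKSZ-sized pages, so a fixed 256 kB buffer means halving
wal_buffers each time the block size doubles; the query is only an
illustration):

  -- 262144 bytes = 256 kB
  SELECT (2048 << i) AS xlog_blcksz,
         262144 / (2048 << i) AS wal_buffers_for_256kb
  FROM generate_series(0, 4) AS i;
  -- 2048 -> 128, 4096 -> 64, 8192 -> 32, 16384 -> 16, 32768 -> 8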

Andreas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] [SoC] Relation between project XML improvements and pgxml

2006-05-03 Thread Robert Staudinger

Hello,

on your summer of code page [1] you outline a project "XML improvements".
Is there any relation (similar goals, feature overlap, technical
relation) to the pgxml project mentioned for some time on [2]? I have
been (remotely) following Oleg Bartunov's page on GiST usage and
datatypes for some time; now that I'm pondering the submission of a SoC
proposal, I'm wondering if the XML improvements project is a completely
new approach, maybe even superseding the approach outlined by Oleg.

[1] http://www.postgresql.org/developer/summerofcode
[2] http://www.sai.msu.su/~megera/postgres/gist/

Best regards,
Rob Staudinger

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] bug? non working casts for domain

2006-05-03 Thread Fabien COELHO


Dear PostgreSQL developer.

Although it is allowed to create a cast for a domain, it seems that there
is no way to trigger it. You can find attached an SQL script illustrating
the issue with PostgreSQL 8.1.3. The create cast and create domain
documentation does not seem to discuss this point.


ISTM that it is a pg bug. Indeed, either

(1) the create cast should be rejected if it is not allowed for domains.

or

(2) the function should be triggered by explicit casts to the domain.

Have a nice day,

--
Fabien.

DROP DOMAIN a_year CASCADE;

-- a simple domain
CREATE DOMAIN a_year AS INTEGER
CHECK (VALUE BETWEEN 1 AND 3000);

-- ok
SELECT 1::a_year;
SELECT CAST('2000' AS a_year);

-- fails as expected
SELECT 0::a_year; 

CREATE FUNCTION date2year(DATE)
RETURNS a_year IMMUTABLE STRICT AS $$
SELECT EXTRACT(YEAR FROM $1)::a_year;
$$ LANGUAGE sql;

-- ok
SELECT date2year(CURRENT_DATE);

-- fails as expected
SELECT date2year(DATE '3001-01-01');

CREATE CAST (DATE AS a_year) 
WITH FUNCTION date2year(DATE);

-- fails, I would expect 1970
SELECT (DATE '1970-03-20')::a_year;

-- fails, I would expect the current year
SELECT CURRENT_DATE::a_year;
SELECT CAST(CURRENT_DATE AS a_year);
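
-- Workaround sketch (not part of the original script): calling the cast
-- function directly, or casting through the base type, does apply the
-- domain's CHECK constraint today.
SELECT date2year(DATE '1970-03-20');                      -- 1970
SELECT EXTRACT(YEAR FROM CURRENT_DATE)::INTEGER::a_year;  -- current year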

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Automatic free space map filling

2006-05-03 Thread Hannu Krosing
On Fri, 2006-03-03 at 11:37, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  So for you it would certainly help a lot to be able to vacuum the first
  X pages of the big table, stop, release locks, create new transaction,
  continue with the next X pages, lather, rinse, repeat.
 
  This is perfectly doable, it only needs enough motivation from a
  knowledgeable person.
 
 Bruce and I were discussing this the other day; it'd be pretty easy to
 make plain VACUUM start a fresh transaction immediately after it
 finishes a scan heap/clean indexes/clean heap cycle.  

Do you mean the full (scan heap/clean indexes/clean heap) cycle or some
smaller cycles inside each step?

If you mean the full cycle, then it is probably not worth it, as even a
single 'clean index' pass can take hours on larger tables.

 The infrastructure
 for this (in particular, session-level locks that won't be lost by
 closing the xact) is all there.  You'd have to figure out how often to
 start a new xact ... every cycle is probably too often, at least for
 smaller maintenance_work_mem settings ... but it'd not be hard or
 involve any strange changes in system semantics.

---
Hannu



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] drop database command blocking other connections

2006-05-03 Thread Jim Buttafuoco
From time to time I have to drop a very large database (1TB+). The drop
database command takes a long time to complete while it's deleting the
files. During this time, no one can connect to the database server; ps
displays "startup waiting". This is with PostgreSQL 7.4. Has this been
addressed in 8.1? If not, does anyone have some ideas on how to speed
this up?

thanks
Jim


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] patch review, please: Autovacuum/Vacuum times via stats.

2006-05-03 Thread Martijn van Oosterhout
On Tue, May 02, 2006 at 05:49:33PM -0500, Jim C. Nasby wrote:
 Back in the discussion of this someone had mentioned capturing all the
 info that you'd get from a vacuum verbose; dead tuples, etc. What do
 people think about that? In particular I think it'd be handy to know how
 many pages vacuum wanted in the FSM vs. how many it got; this would make
 it much easier for people to ensure that the FSM is large enough. Using
 the functions that let you query the FSM won't work because they can't
 tell you if there are pages that should have been in the FSM but didn't
 make it in.

That's a good idea too, but in that case I'd vote for putting it into a
separate table/view and not with the stats relating to number of seq
scans, for example.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Is a SERIAL column a black box, or not?

2006-05-03 Thread Jim C. Nasby
On Tue, May 02, 2006 at 07:45:13PM -0700, elein wrote:
 On Tue, May 02, 2006 at 12:00:42PM -0500, Jim C. Nasby wrote:
  On Mon, May 01, 2006 at 06:43:00PM -0700, elein wrote:
   On Mon, May 01, 2006 at 07:47:06PM -0400, Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
 I think a big point that's being missed here is that SERIAL *is* 
 trying
 to be simple. If you need something more sophisticated or complex you
 shouldn't be using SERIAL at all, you should be doing the stuff
 yourself, by hand.

I agree with this point in the abstract, but one important proviso is
that it has to be *possible* to do it by hand.  One good thing about
the SERIAL is just a macro approach is that it keeps us honest about
making sure that SERIAL isn't exploiting any weird internal behaviors
that are hard to duplicate for handmade sequence defaults.  We've
already broken that to some extent by having the hidden dependency,
and that in turn means that fairly-reasonable expectations like
"pg_get_serial_sequence should find the column's associated sequence"
don't work on handmade sequences.  I don't want to go much further in
that direction.  If there's a usability problem we're trying to solve
for SERIALs, we should make sure the problem gets solved for handmade
sequences too.
   
   I agree with Tom's proviso and add one of my own, mentioned earlier.
   It should be easy to use a sequence w/alter sequence almost all of
   the time.  The majority of the crowd should be able to use SERIAL in
   the majority of cases.  One reason I am adamant about this is the
   v. useful dependencies that are (should be) set between the table 
   and the sequence when it is declared as a SERIAL.
  
  I agree that we shouldn't be arbitrarily removing functionality from
  SERIALs that would exist with a hand-grown sequence unless there's good
  reason.
  
  I'm wondering if it would be best to essentially promote SERIALs to
  being their own type of object? So instead of relying on a naming
  convention or pg_get_serial_sequence to then make calls that touch the
  underlying sequence (which probably shouldn't be directly accessible),
  create functions/syntax that allows the required operations on a SERIAL
  itself, such as table.column.nextval(), or nextval(table.column).
  
  Another way to look at this is how we handle VIEWS. Views are
  implemented under-the-covers as a rule and some hidden table, yet we
  don't support (or even allow?) people mucking with the stuff that's
  under the hood. I think it would be best from a user standpoint if we
  took the same approach with SERIAL, as long as we provide most of the
  power that users would have from going the manual sequence route (I say
  most because there's probably some oddball cases that wouldn't make
  sense supporting, such as two SERIALs operating off the same sequence).
 
 This is not what I meant.  I meant that most things should be able to be
 done by a combination of a SERIAL column definition plus ALTER SERIAL.
 But there are other reasons to have sequences as stand alone objects.
 
I'm certainly not suggesting we remove sequences. What I'm saying is
that because a serial is intended to be a time saver, it should act like
one. That means no need to grant separate permissions, and when you drop
the table or column, the serial should go away as well.
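
For readers following along, a sketch of the "macro" expansion being
discussed (roughly what the 8.1 docs describe for SERIAL):

  CREATE TABLE t (id SERIAL);
  -- behaves approximately like the handmade equivalent:
  CREATE SEQUENCE t_id_seq;
  CREATE TABLE t (
      id integer NOT NULL DEFAULT nextval('t_id_seq')
  );
  -- plus the hidden dependency, so dropping the table or column
  -- drops the sequence as well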

 And don't get me started on how you cannot create a select rule.
 In that case the code to prevent proper use of create rules is probably
 as extensive as the code to implement views.

Uhm, according to the docs you can create select rules. Or are you
suggesting that people should be able to muck around with the internals
of a view?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] sblock state on FreeBSD 6.1

2006-05-03 Thread Jim C. Nasby
On Tue, May 02, 2006 at 11:06:59PM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  Just experienced a server that was spending over 50% of CPU time in the
  system, apparently dealing with postmasters that were in the sblock
  state. Looking at the FreeBSD source, this indicates that the process is
  waiting for a lock on a socket. During this time the machine was doing
  nearly 200k context switches a second.
 
 Which operations require such a lock?  If plain read/write needs the
 lock then heavy contention is hardly surprising.

From what little I've been able to decipher of the FBSD kernel source,
it appears that socket creation and destruction require the lock, as
well as (at least) writing to the socket, though in the latter case it
depends on some flags/options.

  Any ideas what areas of the code could be locking a socket?
  Theoretically it shouldn't be the stats collector, and the site is using
  pgpool as a connection pool, so this shouldn't be due to trying to
  connect to backends at a furious rate.
 
 Actually, the stats socket seems like a really good bet to me, since all
 the backends will be interested in the same socket.  The
 client-to-backend sockets are only touched by two processes each, so
 don't seem like big contention sources.

Do we take specific steps to ensure that we don't block when attempting
to write to these sockets? I *think* there's a flag that's associated
with the socket descriptor that determines locking behavior, but I
haven't been able to find a great deal of info.

Right now the only way I can think of to try and reproduce this is to
modify the code so that we're passing a much larger amount of data to
the stats logger and then fire up pgbench. But I know there's been some
discussion about changing things so that we won't drop stats messages,
so maybe it's a moot point.

BTW, this server does have command string logging on, so if this is a
stats issue that probably made the problem worse. Would it be practical
to have backends only log the command string if the command runs for
more than some number of milliseconds? I suspect there's very little
case for actually trying to log every single command, so realistically
people are only going to care about commands that are taking a decent
amount of time.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] inclusion of hstore software in main tarball

2006-05-03 Thread hubert depesz lubaczewski
hi

some time ago i was told on postgresql-general about the existence of the
hstore package for postgresql. as i understand it, it is being developed
by the same guys that are behind tsearch2, gin and (at least) recent
changes in gist.

would it be possible to include this software in the main postgresql
tarball? it would be best if it came as a standard feature, but a contrib
module would also work.

if you are not familiar with hstore - it is a new datatype for postgresql
which allows you to store (and browse and search) any number of pairs of
(key, value) in one field.

i have a number of projects coming to mind which could take advantage of
something like this. the simplest is an e-commerce site, where all custom
fields (fields dependent on product type) can be stored in one, easily
retrievable field - thus reducing database load by skipping additional
table scans for custom field table(s).

if you would like to check it out, the url is
http://www.sai.msu.su/~megera/postgres/gist/hstore/README.hstore, and the
software itself is downloadable from
http://www.sai.msu.su/~megera/postgres/gist/

best regards,
hubert
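
For the curious, a usage sketch based on the README linked above (the
table and column names here are made up):

  -- one hstore column holds all product-type-dependent attributes
  CREATE TABLE product (
      id     serial PRIMARY KEY,
      attrs  hstore
  );

  INSERT INTO product (attrs) VALUES ('color => red, size => XL');

  -- fetch a single custom field without a separate attribute table
  SELECT attrs -> 'color' AS color FROM product;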


Re: [HACKERS] sblock state on FreeBSD 6.1

2006-05-03 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 On Tue, May 02, 2006 at 11:06:59PM -0400, Tom Lane wrote:
 Actually, the stats socket seems like a really good bet to me, since all
 the backends will be interested in the same socket.  The
 client-to-backend sockets are only touched by two processes each, so
 don't seem like big contention sources.

 Do we take specific steps to ensure that we don't block when attempting
 to write to these sockets?

Well, we have the socket set to O_NONBLOCK mode.  Whether that avoids
the problem you're seeing ...

 BTW, this server does have command string logging on, so if this is a
 stats issue that probably made the problem worse.

Can you turn that off for a bit and see if it affects things?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] inclusion of hstore software in main tarball

2006-05-03 Thread Lukas Smith

hubert depesz lubaczewski wrote:

i have a number of projects coming to mind which could take advantage
of something like this. the simplest is an e-commerce site, where all
custom fields (fields dependent on product type) can be stored in one,
easily retrievable field - thus reducing database load by skipping
additional table scans for custom field table(s).


<slightly_offtopic>
I attended a talk on DB2's XML "Viper" stuff the other day, and they were
touting its high-performance XML/XQuery support for stuff like this
(their favorite example seems to be meeting appointments). Essentially
they were saying a lot of things simply do not fit a fixed storage
layout, but there are enough similarities between data sets that you can
still benefit from indexing things.

However, their stuff also supports nested structures due to its XML
nature. Also due to the XML nature, you can optionally check all data
going in against an XSD to prevent people from going all too crazy.
</slightly_offtopic>

regards,
Lukas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] patch review, please: Autovacuum/Vacuum times via stats.

2006-05-03 Thread Jim C. Nasby
On Wed, May 03, 2006 at 03:54:57PM +0200, Martijn van Oosterhout wrote:
 On Tue, May 02, 2006 at 05:49:33PM -0500, Jim C. Nasby wrote:
  Back in the discussion of this someone had mentioned capturing all the
  info that you'd get from a vacuum verbose; dead tuples, etc. What do
  people think about that? In particular I think it'd be handy to know how
  many pages vacuum wanted in the FSM vs. how many it got; this would make
  it much easier for people to ensure that the FSM is large enough. Using
  the functions that let you query the FSM won't work because they can't
  tell you if there are pages that should have been in the FSM but didn't
  make it in.
 
 That's a good idea too, but in that case I'd vote for putting it into a
 seperate table/view and not with the stats relating to number of seq
 scans for example.

Agreed; it doesn't really make much sense to me to be putting info about
vacuum in the stat view anyway; ISTM it should be a stand-alone view.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] inclusion of hstore software in main tarball

2006-05-03 Thread Jim C. Nasby
On Wed, May 03, 2006 at 07:06:09PM +0200, hubert depesz lubaczewski wrote:
 hi
 some time ago i was told on postgresql-general about existence of hstore
 package for postgresql.
 as i understand it is being developed by the same guys that are behind
 tsearch2, gin and (at least) recent changes in gist.
 
 would it be possible to include this software in main postgresql tarball? it
 would be best if it came as standard feature, but contrib module would also
 work.

Why put it in contrib unless it depends on being in there? Just grab it
from pgFoundry if you need it.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] sblock state on FreeBSD 6.1

2006-05-03 Thread Jim C. Nasby
On Wed, May 03, 2006 at 01:37:03PM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  On Tue, May 02, 2006 at 11:06:59PM -0400, Tom Lane wrote:
  Actually, the stats socket seems like a really good bet to me, since all
  the backends will be interested in the same socket.  The
  client-to-backend sockets are only touched by two processes each, so
  don't seem like big contention sources.
 
  Do we take specific steps to ensure that we don't block when attempting
  to write to these sockets?
 
 Well, we have the socket set to O_NONBLOCK mode.  Whether that avoids
 the problem you're seeing ...

A quick grep through the source code doesn't look too promising, so
maybe that's not the proper way to avoid blocking on FBSD. Though Larry
was telling me that there have recently been changes made in the socket
code, so maybe this problem was fixed recently.

  BTW, this server does have command string logging on, so if this is a
  stats issue that probably made the problem worse.
 
 Can you turn that off for a bit and see if it affects things?

That would require being able to easily reproduce the problem, which I'm
not sure will be possible, since the site was handling over 400
concurrent requests at a time from the web when this happened. That's
why I'm wondering if it might be a better idea to test on another
machine with a copy of the code that's been hacked to send a big pile of
data to the stats process with every query. That would hopefully allow
pgbench to exhibit the same behavior.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Automatic free space map filling

2006-05-03 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 If you mean the full cycle, then it is probably not worth it, as even a
 single 'clean index' pass can take hours on larger tables.

The patch Heikki is working on will probably alleviate that problem,
because it will allow vacuum to scan the indexes in physical rather than
logical order.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] sblock state on FreeBSD 6.1

2006-05-03 Thread Larry Rosenman
Jim C. Nasby wrote:
 On Wed, May 03, 2006 at 01:37:03PM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
 On Tue, May 02, 2006 at 11:06:59PM -0400, Tom Lane wrote:
 Actually, the stats socket seems like a really good bet to me,
 since all the backends will be interested in the same socket.  The
 client-to-backend sockets are only touched by two processes each,
 so don't seem like big contention sources.
 
 Do we take specific steps to ensure that we don't block when
 attempting to write to these sockets?
 
 Well, we have the socket set to O_NONBLOCK mode.  Whether that avoids
 the problem you're seeing ...
 
 A quick grep through the source code doesn't look too promising, so
 maybe that's not the proper way not to block on FBSD. Though Larry was
 telling me that there's recently been changes made in the socket code,
 so maybe this problem was fixed recently.

I didn't see a direct hit looking at the routines we talked about
yesterday, but I can't be sure.

-- 
Larry Rosenman  
Database Support Engineer

PERVASIVE SOFTWARE. INC.
12365B RIATA TRACE PKWY
3015
AUSTIN TX  78727-6531 

Tel: 512.231.6173
Fax: 512.231.6597
Email: [EMAIL PROTECTED]
Web: www.pervasive.com 

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] drop database command blocking other connections

2006-05-03 Thread Tom Lane
Jim Buttafuoco [EMAIL PROTECTED] writes:
 From time to time I have to drop a very large database (1TB+). The drop
 database command takes a long time to complete while it's deleting the
 files. During this time, no one can connect to the database server; ps
 displays "startup waiting". This is with PostgreSQL 7.4. Has this been
 addressed in 8.1? If not, does anyone have some ideas on how to speed
 this up?

No, it'll probably behave the same in CVS HEAD.  The problem is that
DROP DATABASE has to lock out new connections to the victim database,
and the mechanism it's using for that is a table-level lock on
pg_database, rather than something specific to one database.  So
new connections to *all* DBs in the cluster will be blocked while
DROP DATABASE runs.

It strikes me that we could fix this by taking out special locks on the
database as an object (using LockSharedObject) instead of relying on
locking pg_database.  There wasn't any locktag convention that'd work
for that back in 7.4, but it surely seems doable now.
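
As an illustration (a hypothetical 8.1 session, not from the original
mail), the offending lock is visible in pg_locks while the DROP runs:

  -- shows the table-level lock on pg_database held by DROP DATABASE
  SELECT locktype, relation::regclass AS locked_rel, mode, granted, pid
  FROM pg_locks
  WHERE relation = 'pg_database'::regclass;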

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] patch review, please: Autovacuum/Vacuum times via stats.

2006-05-03 Thread Tom Lane
 On Tue, May 02, 2006 at 05:49:33PM -0500, Jim C. Nasby wrote:
 Back in the discussion of this someone had mentioned capturing all the
 info that you'd get from a vacuum verbose; dead tuples, etc. What do
 people think about that? In particular I think it'd be handy to know how
 many pages vacuum wanted in the FSM vs. how many it got; this would make
 it much easier for people to ensure that the FSM is large enough.

Isn't this already dealt with by contrib/pg_freespacemap?
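
(For reference, a usage sketch; the view names below are how I recall the
contrib module defining them, so treat them as an assumption:)

  -- hypothetical session: inspect what the FSM currently stores
  SELECT * FROM pg_freespacemap_relations LIMIT 10;
  SELECT * FROM pg_freespacemap_pages LIMIT 10;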

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] drop database command blocking other connections

2006-05-03 Thread Jim Buttafuoco
Tom,

I am trying to migrate all of my databases from 7.4 to 8.1. It takes a
lot of disk space to have both online at the same time. I have done
around 2TB of actual disk space to date and have another 6TB to do over
the next month or so. I have been moving (with pg_dump 7.4db | psql
8.1db) each database to 8.1 and then dropping the 7.4 one (after some
testing). It would be nice if this were fixed so that when I have to
move from 8.1 to 8.2 it will not be an issue.

Thanks for your time
Jim



-- Original Message ---
From: Tom Lane [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: pgsql-hackers pgsql-hackers@postgresql.org
Sent: Wed, 03 May 2006 14:23:08 -0400
Subject: Re: [HACKERS] drop database command blocking other connections 

 Jim Buttafuoco [EMAIL PROTECTED] writes:
  From time to time I have to drop a very large database (1TB+). The
  drop database command takes a long time to complete while it's
  deleting the files. During this time, no one can connect to the
  database server; ps displays "startup waiting". This is with
  PostgreSQL 7.4. Has this been addressed in 8.1? If not, does anyone
  have some ideas on how to speed this up?
 
 No, it'll probably behave the same in CVS HEAD.  The problem is that
 DROP DATABASE has to lock out new connections to the victim database,
 and the mechanism it's using for that is a table-level lock on
 pg_database, rather than something specific to one database.  So
 new connections to *all* DBs in the cluster will be blocked while
 DROP DATABASE runs.
 
 It strikes me that we could fix this by taking out special locks on the
 database as an object (using LockSharedObject) instead of relying on
 locking pg_database.  There wasn't any locktag convention that'd work
 for that back in 7.4, but it surely seems doable now.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend
--- End of Original Message ---


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Is a SERIAL column a black box, or not?

2006-05-03 Thread elein
On Wed, May 03, 2006 at 10:12:28AM -0500, Jim C. Nasby wrote:
 On Tue, May 02, 2006 at 07:45:13PM -0700, elein wrote:
  On Tue, May 02, 2006 at 12:00:42PM -0500, Jim C. Nasby wrote:
   On Mon, May 01, 2006 at 06:43:00PM -0700, elein wrote:
On Mon, May 01, 2006 at 07:47:06PM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  I think a big point that's being missed here is that SERIAL *is* 
  trying
  to be simple. If you need something more sophisticated or complex 
  you
  shouldn't be using SERIAL at all, you should be doing the stuff
  yourself, by hand.
 
 I agree with this point in the abstract, but one important proviso is
 that it has to be *possible* to do it by hand.  One good thing about
 the SERIAL is just a macro approach is that it keeps us honest about
 making sure that SERIAL isn't exploiting any weird internal behaviors
 that are hard to duplicate for handmade sequence defaults.  We've
 already broken that to some extent by having the hidden dependency,
 and that in turn means that fairly-reasonable expectations like
 "pg_get_serial_sequence should find the column's associated sequence"
 don't work on handmade sequences.  I don't want to go much further in
 that direction.  If there's a usability problem we're trying to solve
 for SERIALs, we should make sure the problem gets solved for handmade
 sequences too.

I agree with Tom's proviso and add one of my own, mentioned earlier.
It should be easy to use a sequence w/alter sequence almost all of
the time.  The majority of the crowd should be able to use SERIAL in
the majority of cases.  One reason I am adamant about this is the
v. useful dependencies that are (should be) set between the table 
and the sequence when it is declared as a SERIAL.
   
   I agree that we shouldn't be arbitrarily removing functionality from
   SERIALs that would exist with a hand-grown sequence unless there's good
   reason.
   
   I'm wondering if it would be best to essentially promote SERIALs to
   being their own type of object? So instead of relying on a naming
   convention or pg_get_serial_sequence to then make calls that touch the
   underlying sequence (which probably shouldn't be directly accessible),
   create functions/syntax that allows the required operations on a SERIAL
   itself, such as table.column.nextval(), or nextval(table.column).
   
    Another way to look at this is how we handle VIEWS. Views are
    implemented under-the-covers as a rule and some hidden table, yet we
    don't support (or even allow?) people mucking with the stuff that's
    under the hood. I think it would be best from a user standpoint if we
    took the same approach with SERIAL, as long as we provide most of the
    power that users would have from going the manual sequence route (I say
    most because there's probably some oddball cases that wouldn't make
    sense supporting, such as two SERIALs operating off the same sequence).
  
  This is not what I meant.  I meant that most things should be able to be
  done by a combination of a SERIAL column definition plus ALTER SERIAL.
  But there are other reasons to have sequences as stand alone objects.
  
 I'm certainly not suggesting we remove sequences. What I'm saying is
 that because a serial is intended to be a time saver, it should act like
 one. That means no need to grant separate permissions, and when you drop
 the table or column, the serial should go away as well.
 
  And don't get me started on how you cannot create a select rule.
  In that case the code to prevent proper use of create rules is probably
  as extensive as the code to implement views.
 
 Uhm, according to the docs you can create select rules. Or are you
 suggesting that people should be able to muck around with the internals
 of a view?

I warned you not to get me started :)  I retract my little side swipe
to avoid going into that discussion here and now.  This is not the
appropriate thread.  (But that does not mean I do not have opinions 
about the limitations of select rules, etc. :)

~elein

 -- 
 Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
 Pervasive Software  http://pervasive.comwork: 512-231-6117
 vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
 
 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq
 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Rethinking locking for database create/drop vs connection startup

2006-05-03 Thread Tom Lane
This is motivated by Jim Buttafuoco's recent gripe about not being
able to connect while a DROP DATABASE is in progress:
http://archives.postgresql.org/pgsql-hackers/2006-05/msg00074.php
The whole area is really pretty grotty anyway --- the existing interlock
does not prevent an incoming connection from trying to connect to the
victim database, only make sure that we detect it later.  This is not
very good, for two reasons.  One is that you'll most likely get a very
unfriendly error message due to attempts to access already-missing
system catalogs; when I experimented just now I got

psql: FATAL:  could not open relation 1663/104854/1259: No such file or directory

which is really not the way I'd like to report "database foo just got
deleted".  The other problem is that I'm not entirely convinced that a
backend trying to do this won't leave any permanent problems behind,
most likely in the form of dirty shared buffers for subsequently-deleted
system catalogs in the victim database.  ReverifyMyDatabase tries to
clean that up by doing DropDatabaseBuffers, but that only helps if we
get as far as ReverifyMyDatabase.

It strikes me that we now have a decent tool for solving the problem,
which is LockSharedObject() --- that is, there exists a locktag
convention whereby we can take a lock on a database as such, rather
than having to use table-level locks on pg_database as proxy.  The
locktag would be in the form of an OID so it would identify a DB by
OID.  If dropdb() takes such a lock before it checks for active
backends, then the connection sequence can look like this:

1. read pg_database flat file to find out OID of target DB
2. initialize far enough to be able to start a transaction,
   and do so
3. take a shared lock on the target DB by OID
4. re-read pg_database flat file and verify DB still exists

If step 4 fails to find the DB in the flat file, then we can bomb
out before we've made any attempt to touch catalogs of the target
DB.  This ensures both a reasonable error message, and no pollution
of shared buffers.  If we get past step 4 then we don't have to worry
about concurrent dropdb() anymore.  (The shared lock will only last
until we commit the startup transaction, but that's OK --- once we
are listed in the PGPROC array we don't need the lock anymore.)

It's slightly annoying to have to read the flat file twice, but
for reasonable numbers of databases per installation I don't think
this will pose any material performance penalty.  The file will
certainly still be sitting in kernel disk cache.

It's still necessary to serialize CREATE/DROP DATABASE commands against
each other, to ensure that only one backend tries to write the flat file
at a time, but with this scheme they'd not need to block connections
being made to unrelated databases.

Thoughts, better ideas?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [SoC] Relation between project XML improvements and pgxml

2006-05-03 Thread Jonah H. Harris

On 5/3/06, Robert Staudinger [EMAIL PROTECTED] wrote:

 on your summer of code page [1] you outline a project "XML improvements".
 Is there any relation (similar goals, feature overlap, technical
 relation) to the pgxml project mentioned for some time on [2]?

No, the XML project idea submitted did not include Oleg's stuff at all.

 now that I'm pondering the submission of a SoC proposal I'm wondering
 if the XML improvements project is a completely new approach, maybe
 even superseding the approach outlined


At this point in time, I don't believe there's any single best way to
go regarding XML-handling in PostgreSQL.  If you have a neat project
idea, please propose it to us on Google's site!

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] drop database command blocking other connections

2006-05-03 Thread Tony Wasson

On 5/3/06, Jim Buttafuoco [EMAIL PROTECTED] wrote:

From time to time I have to drop a very large database (1TB+). The drop
database command takes a long time to complete while it's deleting the
files. During this time, no one can connect to the database server; ps
displays "startup waiting". This is with PostgreSQL 7.4. Has this been
addressed in 8.1? If not, does anyone have some ideas on how to speed
this up?


I don't have a fix, but I can offer a workaround.

When we need to drop large DBs, we drop them a schema at a time. DROP
SCHEMA does *not* block new connections to the server. Once the data
is out of the schema(s), a DROP DATABASE on a nearly empty database
does not block new connections for more than a moment.
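
A sketch of that procedure (the database and schema names here are
placeholders):

  -- drop the bulk of the data schema-by-schema first
  DROP SCHEMA big_schema CASCADE;
  -- ... repeat for each large schema, then, from another database:
  DROP DATABASE victim_db;  -- now nearly empty, so this is quick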

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] drop database command blocking other connections

2006-05-03 Thread Jim Buttafuoco
Nice workaround; I am going to modify my procedure to drop the public
schema first (it is the largest one).


-- Original Message ---
From: Tony Wasson [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: pgsql-hackers pgsql-hackers@postgresql.org
Sent: Wed, 3 May 2006 14:09:05 -0700
Subject: Re: [HACKERS] drop database command blocking other connections

 On 5/3/06, Jim Buttafuoco [EMAIL PROTECTED] wrote:
  From time to time I have to drop a very large database (1TB+). The
  drop database command takes a long time to complete while it's
  deleting the files. During this time, no one can connect to the
  database server; ps displays "startup waiting". This is with
  PostgreSQL 7.4. Has this been addressed in 8.1? If not, does anyone
  have some ideas on how to speed this up?
 
 I don't have a fix, but I can offer a workaround.
 
 When we need to drop large DBs, we drop them a schema at a time. DROP
 SCHEMA does *not* block new connections to the server. Once the data
 is out of the schema(s), a DROP DATABASE on a nearly empty database
 does not block new connections for more than a moment.
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster
--- End of Original Message ---


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [SoC] Relation between project XML improvements and pgxml

2006-05-03 Thread Nikolay Samokhvalov

Yes, there is no single best way, because there are different tasks.
There are many papers on this theme.

I'm pretty sure that first of all we need to analyze other DBMSes'
experience. I'm working on it, analyzing MS, ORA and DB2 (first
results of the analysis will be available in several weeks).

I've submitted the proposal 'XMLType for PostgreSQL' to the Google SoC
page (my 'minimum' list from here:
http://archives.postgresql.org/pgsql-hackers/2006-05/msg00044.php).

On 5/4/06, Jonah H. Harris [EMAIL PROTECTED] wrote:

On 5/3/06, Robert Staudinger [EMAIL PROTECTED] wrote:
 on your summer of code page [1] you outline a project XML improvements.
 Is there any relation (similar goals, feature overlapping, technical
 relation) to the pgxml project mentioned for some time on [2]?

No, the XML project idea submitted did not include Oleg's stuff at all.

 now that I'm pondering the submission of a
 SoC proposal I'm wondering if the XML improvements project is a
 completely new approach, maybe even superceding the
 approach outlined

At this point in time, I don't believe there's any single best way to
go regarding XML-handling in PostgreSQL.  If you have a neat project
idea, please propose it to us on Google's site!

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings




--
Best regards,
Nikolay

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] Warts with SELECT DISTINCT

2006-05-03 Thread Greg Stark


Normally Postgres extends SQL to allow ORDER BY to include arbitrary
expressions not in the select list. However this doesn't seem to work with
SELECT DISTINCT.

  stark=> \d test
       Table "public.test"
   Column | Type | Modifiers
  --------+------+-----------
   col1   | text |

  stark=> select distinct col1 from test order by upper(col1);
  ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list


It seems like as long as the expressions involve only columns or expressions
present in the SELECT DISTINCT list and as long as those functions are stable
or immutable then this shouldn't be a problem. Just prepend those expressions
to the select list to use as the sort key.
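
A sketch of that desugaring (what the suggestion would amount to, not
what 8.1 actually does):

  -- internally treat
  --   SELECT DISTINCT col1 FROM test ORDER BY upper(col1);
  -- as if it were written
  SELECT upper(col1) AS sortkey, col1
  FROM test
  GROUP BY col1, upper(col1)
  ORDER BY sortkey;
  -- ...then strip the prepended sortkey column from the output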

In fact the equivalent GROUP BY query does work as expected:

stark=> select col1 from test group by col1 order by upper(col1);
 col1
------
 a
 c
 x
(3 rows)


Though it's optimized poorly and does a superfluous sort step:

stark=> explain select col1 from test group by col1 order by upper(col1);
                              QUERY PLAN
-----------------------------------------------------------------------
 Sort  (cost=99.72..100.22 rows=200 width=32)
   Sort Key: upper(col1)
   ->  Group  (cost=85.43..92.08 rows=200 width=32)
         ->  Sort  (cost=85.43..88.50 rows=1230 width=32)
               Sort Key: col1
               ->  Seq Scan on test  (cost=0.00..22.30 rows=1230 width=32)
(6 rows)


Whereas it shouldn't be hard to prove that this is equivalent:

stark=> explain select col1 from test group by upper(col1),col1 order by upper(col1);
                            QUERY PLAN
-------------------------------------------------------------------
 Group  (cost=88.50..98.23 rows=200 width=32)
   ->  Sort  (cost=88.50..91.58 rows=1230 width=32)
         Sort Key: upper(col1), col1
         ->  Seq Scan on test  (cost=0.00..25.38 rows=1230 width=32)
(4 rows)


My understanding is that the DISTINCT and DISTINCT ON code path is old and
grotty. Perhaps it's time to remove those code paths, and replace them with a
transformation that creates the equivalent GROUP BY query and then optimize
that path until it can produce plans as good as DISTINCT and DISTINCT ON ever
did.

-- 
greg


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [SoC] Relation between project XML improvements and pgxml

2006-05-03 Thread Nikolay Samokhvalov

Actually, the project mentioned on Oleg's page is only at the planning
stage.

I see some gap between the current moment and the moment when GiST will
come to power for XML support - check out my proposal
(http://archives.postgresql.org/pgsql-hackers/2006-05/msg00044.php);
the 'minimum' list is that gap. When we have some basic support
(ideally, according to the SQL:200n SQL/XML standard and based on
experience taken from commercial DBMSes), we would work on index support
(without which this project definitely won't be applicable to production
purposes) - first of all, path indexes and structure indexes. This
includes some labeling schema (probably a prefix schema, see
http://davis.wpi.edu/dsrg/vamana/WebPages/Publication.html or the papers
about MS' ORDPATHs). GiST and GIN will definitely help here a lot.

On 5/3/06, Robert Staudinger [EMAIL PROTECTED] wrote:

Hello,

on your summer of code page [1] you outline a project "XML improvements".
Is there any relation (similar goals, feature overlap, technical
relation) to the pgxml project mentioned for some time on [2]? I have
been (remotely) following Oleg Bartunov's page on GiST usage and
datatypes for some time; now that I'm pondering the submission of a SoC
proposal, I'm wondering if the XML improvements project is a completely
new approach, maybe even superseding the approach outlined by Oleg.

[1] http://www.postgresql.org/developer/summerofcode
[2] http://www.sai.msu.su/~megera/postgres/gist/

Best regards,
Rob Staudinger

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq




--
Best regards,
Nikolay

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-03 Thread Bruno Wolff III
On Wed, May 03, 2006 at 17:58:07 -0400,
  Greg Stark [EMAIL PROTECTED] wrote:
 
 Though it's optimized poorly and does a superfluous sort step:
 
 stark=> explain select col1 from test group by col1 order by upper(col1);
                               QUERY PLAN
 -----------------------------------------------------------------------
  Sort  (cost=99.72..100.22 rows=200 width=32)
    Sort Key: upper(col1)
    ->  Group  (cost=85.43..92.08 rows=200 width=32)
          ->  Sort  (cost=85.43..88.50 rows=1230 width=32)
                Sort Key: col1
                ->  Seq Scan on test  (cost=0.00..22.30 rows=1230 width=32)
 (6 rows)
 
 
 Whereas it shouldn't be hard to prove that this is equivalent:
 
 stark=> explain select col1 from test group by upper(col1),col1 order by upper(col1);
                             QUERY PLAN
 -------------------------------------------------------------------
  Group  (cost=88.50..98.23 rows=200 width=32)
    ->  Sort  (cost=88.50..91.58 rows=1230 width=32)
          Sort Key: upper(col1), col1
          ->  Seq Scan on test  (cost=0.00..25.38 rows=1230 width=32)
 (4 rows)

I don't think you can assume that that will be true for every locale. If
there are two different characters that both have the same uppercase
version (for example, in the default Unicode case mapping both 'i' and
dotless 'ı' uppercase to 'I'), this will break things.

And while you would expect that x = y => upper(x) = upper(y), I am not
sure that is guaranteed for locales. I can imagine having two different
characters that are treated the same for ordering purposes, but have
uppercase versions that are considered different for ordering purposes.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-03 Thread Greg Stark
Bruno Wolff III [EMAIL PROTECTED] writes:

  Whereas it shouldn't be hard to prove that this is equivalent:
  
  stark=> explain select col1 from test group by upper(col1),col1 order by upper(col1);
                              QUERY PLAN
  -------------------------------------------------------------------
   Group  (cost=88.50..98.23 rows=200 width=32)
     ->  Sort  (cost=88.50..91.58 rows=1230 width=32)
           Sort Key: upper(col1), col1
           ->  Seq Scan on test  (cost=0.00..25.38 rows=1230 width=32)
  (4 rows)
 
 I don't think you can assume that that will be true for any locale. If there
 are two different characters that both have the same uppercase version, this
 will break things.

No it won't.

-- 
greg


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-03 Thread Bruno Wolff III
On Thu, May 04, 2006 at 00:05:16 -0400,
  Greg Stark [EMAIL PROTECTED] wrote:
 Bruno Wolff III [EMAIL PROTECTED] writes:
 
   Whereas it shouldn't be hard to prove that this is equivalent:
   
   stark=> explain select col1 from test group by upper(col1),col1 order by upper(col1);
                               QUERY PLAN
   -------------------------------------------------------------------
    Group  (cost=88.50..98.23 rows=200 width=32)
      ->  Sort  (cost=88.50..91.58 rows=1230 width=32)
            Sort Key: upper(col1), col1
            ->  Seq Scan on test  (cost=0.00..25.38 rows=1230 width=32)
   (4 rows)
  
  I don't think you can assume that that will be true for any locale. If there
  are two different characters that both have the same uppercase version, this
  will break things.
 
 No it won't.

Sure it will, because when you do the group by you will get a different
number of groups. When grouping by the original characters, you will get
separate groups for characters that have the same uppercase character,
whereas when grouping by the uppercased characters you won't.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-03 Thread Tom Lane
Bruno Wolff III [EMAIL PROTECTED] writes:
 Greg Stark [EMAIL PROTECTED] wrote [ baldly summarized ]
 [ x < y implies upper(x) < upper(y) ]

 I don't think you can assume that that will be true for any locale.

Whether or not that may actually be true for upper() (I share Bruno's
skepticism, but maybe it's so), it really does not matter because the
planner doesn't have enough knowledge about the behavior of upper() to
make such an inference.

I think it's a fair point that we could allow SELECT DISTINCT x ORDER BY
foo(x) if foo() is stable, but that does not imply that sorting by x is
interchangeable with sorting by foo(x).  foo = abs is a trivial
counterexample.

As far as the original point goes: feel free to reimplement DISTINCT,
but don't break the documented behavior of DISTINCT ON + ORDER BY, or
you'll have a lot of unhappy villagers appear on your doorstep bearing
torches and pitchforks.  It might be mostly an implementation artifact,
but it's an awfully useful one ...
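
(For reference, the documented idiom being defended, as in the SELECT
reference page: the most recent report per location.)

  SELECT DISTINCT ON (location) location, time, report
  FROM weather_reports
  ORDER BY location, time DESC;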

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Rethinking locking for database create/drop vs connection

2006-05-03 Thread Christopher Kings-Lynne

It's slightly annoying to have to read the flat file twice, but
for reasonable numbers of databases per installation I don't think
this will pose any material performance penalty.  The file will
certainly still be sitting in kernel disk cache.


Dropping a DB isn't exactly a common occurrence anyway, no?


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-03 Thread Greg Stark
Bruno Wolff III [EMAIL PROTECTED] writes:

 On Thu, May 04, 2006 at 00:05:16 -0400,
   Greg Stark [EMAIL PROTECTED] wrote:
  Bruno Wolff III [EMAIL PROTECTED] writes:
  
Whereas it shouldn't be hard to prove that this is equivalent:

stark=> explain select col1 from test group by upper(col1),col1 order by upper(col1);
                            QUERY PLAN
-------------------------------------------------------------------
 Group  (cost=88.50..98.23 rows=200 width=32)
   ->  Sort  (cost=88.50..91.58 rows=1230 width=32)
         Sort Key: upper(col1), col1
         ->  Seq Scan on test  (cost=0.00..25.38 rows=1230 width=32)
(4 rows)
   
   I don't think you can assume that that will be true for any locale.
   If there are two different characters that both have the same
   uppercase version, this will break things.
  
  No it won't.
 
 Sure it will, because when you do the group by you will get a different
 number of groups. When grouping by the original characters you will get
 separate groups for characters that have the same uppercase character,
 whereas when grouping by the uppercased characters you won't.

But grouping on *both* will produce the same groups as grouping on the
original characters alone.
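
(A toy way to check that on one's own data, for what it's worth: if col1
functionally determines upper(col1), these two counts come out the same.)

  SELECT count(*) FROM (SELECT col1 FROM test GROUP BY col1) AS a;
  SELECT count(*) FROM (SELECT col1 FROM test GROUP BY upper(col1), col1) AS b;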


-- 
greg


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org