Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Gavin Sherry
On Thu, 16 Jun 2005, Matthew T. O'Connor wrote:

 Gavin Sherry wrote:

 I guess my main concern is that we'll have a solution to the problem of
 dead tuples which is only halfway there. It is only an incremental
 improvement upon the contrib module and solves only one real problem:
 users do not read up on VACUUM or autovacuum. This is at the expense of
 making it appear to be suitable for the general user base when it isn't,
 in my opinion. That isn't the fault of autovacuum but is a function of the
 cost of ordinary vacuum.
 
 

 Would you mind expounding on why you think autovacuum isn't suitable for
 the general public?  I know it's not a silver bullet, but I think in
 general, it will be helpful for most people.

As I said, this is largely the fault of VACUUM. The main thing I'd like to
see is a complete solution to the problem. I'm not picking on autovacuum.
However, I will elaborate a little on why I think autovacuum should not
be a feature of the backend:

1) The main argument so far is that autovacuum will ensure that users who
do not read the maintenance section of the manual will not notice a
deterioration of performance. This means that we anticipate autovacuum
being on by default. This suggests that the default autovacuum
configuration will not need tuning. I do not think that will be the case.

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.

3) autovacuum on by default means row level stats are on by default. This
will have a non-trivial performance impact on users, IMHO. Rightly or
wrongly, our users take the postgresql.conf defaults pretty seriously, and
this level of stats collection could and will remain enabled for some
non-trivial percentage of users who turn autovacuum off (consider many
users' reluctance to change shared_buffers in previous releases). To quote
from the README:

The overhead of the stats system has been shown to be significant under
certain workloads.  For instance, a tight loop of queries performing
"SELECT 1" was found to run nearly 30% slower when row-level stats were
enabled.

I'm not one for "SELECT 1" benchmarks, but this is a problem that hasn't
even been mentioned, as far as I recall.

4) Related to this, I guess, is that a user's FSM settings might be
completely inappropriate. The 'Just read the manual' or 'Just read the
logs' argument doesn't cut it, because the main argument for autovacuum in
the backend is that people do not and will not.

5) It doesn't actually shrink tables -- ie, there's no VACUUM FULL. If
we're telling users about VACUUM less often than we are now, there's bound
to be bloating issues (see 4).
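[Editor's note: for reference, the FSM settings mentioned in point 4 live in postgresql.conf. A sketch of the relevant lines, showing the stock 8.0-era sample values as an illustration rather than a recommendation:]

```
# postgresql.conf -- free space map settings (8.0-era stock values shown;
# a database with heavy UPDATE/DELETE churn typically needs these raised)
max_fsm_pages = 20000        # max number of disk pages with free space to track
max_fsm_relations = 1000     # max number of relations tracked in the FSM
```

If more pages have reclaimable space than max_fsm_pages can track, that space is forgotten until the next VACUUM, which is how tables bloat despite regular vacuuming.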

I guess the main point is, if something major like this ships in the
backend it says to users that the problem has gone away. pg_autovacuum is
a good contrib style solution: it addresses a problem users have and
attempts to solve it the way other users might try and solve it. When you
consider it in the backend, it looks like a workaround. I think users are
better served by solving the real problem.

Gavin

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [GENERAL] INHERITS and planning

2005-06-16 Thread Simon Riggs
On Thu, 2005-06-16 at 01:10 -0400, Greg Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  If you really do need that many, you can go to the trouble of grouping
  them in two levels of nesting, so you have a root table, multiple month
  tables and then each month table with multiple day tables (etc).
 
 I wonder if testing deeply nested inheritance graphs would show up an entirely
 different set of problem areas.

I'm not sure two or three levels counts as "deeply nested", but I suspect you
are correct.

Best Regards, Simon Riggs




Re: [HACKERS] [GENERAL] INHERITS and planning

2005-06-16 Thread Simon Riggs
On Thu, 2005-06-16 at 12:59 +0800, Christopher Kings-Lynne wrote:
  Well, it's not so much that I care about queries with 1000+ relations,
  as that this is a good way to stress-test the code and find out where
  the performance issues are.  There are many thousand lines of code that
  can never be performance-sensitive, but to expose the ones that are
  it helps to push the envelope a bit.
 
 Once we have partitioning and people set up automated scripts to 
 partition off stuff, we may well end up with 1000+ table queries...

I can see why you think that, but there will always be pressure to
reduce the number of partitions for a variety of reasons. IMHO that will
lead to an optimum range of values.

To me, it seems likely there would be a recommendation along the lines
of: divide the table up naturally in a way that gives between 10 and 500
partitions that are mostly roughly equally sized.

Using more than that could lead to some fairly strange designs.
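[Editor's note: the kind of scheme being recommended, using the INHERITS mechanism this thread discusses, might be sketched as follows; table, column, and constraint names are purely illustrative.]

```sql
-- Parent table; queries are issued against this.
CREATE TABLE measurements (
    logdate  date NOT NULL,
    reading  numeric
);

-- One child table per month; the CHECK constraint documents each
-- partition's range.
CREATE TABLE measurements_2005_06 (
    CHECK (logdate >= DATE '2005-06-01' AND logdate < DATE '2005-07-01')
) INHERITS (measurements);

CREATE TABLE measurements_2005_07 (
    CHECK (logdate >= DATE '2005-07-01' AND logdate < DATE '2005-08-01')
) INHERITS (measurements);

-- A query on the parent plans a scan over every child, which is why
-- the number of partitions matters to the planner:
SELECT count(*) FROM measurements WHERE logdate >= DATE '2005-07-01';
```

With hundreds or thousands of children, planning such a query must touch every child relation, which is exactly what the stress tests discussed in this thread exercise.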

Anyway, let's wait and see.

Best Regards, Simon Riggs




[HACKERS] PROPOSAL - User's exception in PL/pgSQL

2005-06-16 Thread Pavel Stehule
Hello

I did some work on an implementation of user exceptions.

Generally:

 o add pseudotype EXCEPTION 
 
DECLARE excpt EXCEPTION [= 'SQLSTATE']

 o change RAISE stmt

RAISE error_level [excpt_var|sys_excpt_name] errmsg, ...

 o change EXCEPTION

EXCEPTION WHEN excpt_var|sys_excpt_name THEN ...

Rules:
 o User can specify SQLSTATE only from class 'U1'
 o Default values for SQLSTATE for user exceptions are from class 'U0'
 o Every exception's variable has unique SQLSTATE
 o User's exception or system's exception can be raised only with
   level EXCEPTION

Any comments, notes?

Regards
Pavel Stehule

Regression test:

create function innerfx() returns integer as $$
declare my_excpt exception = 'U0001';
begin -- using msgtext as one param of exception
  raise exception my_excpt '%', CURRENT_TIMESTAMP;
  return 1;
end $$ language plpgsql;
psql:regres.sql:6: ERROR:  Invalid class for SQLSTATE value 'U0001' for 
user's exception.
HINT:  Select any unoccupied value from class U1 which is reserved for 
user's exception.
CONTEXT:  compile of PL/pgSQL function innerfx near line 1
create function innerfx() returns integer as $$
declare
  my_excpt exception = 'U1001';
  my_sec_excpt exception = 'U1001';
begin -- using msgtext as one param of exception
  raise exception my_excpt '%', CURRENT_TIMESTAMP;
  return 1;
end $$ language plpgsql;
psql:regres.sql:15: ERROR:  Invalid SQLSTATE value 'U1001' for user's 
exception.
HINT:  Select any unoccupied value from class U1 which is reserved for 
user's exception.
CONTEXT:  compile of PL/pgSQL function innerfx near line 3
create function innerfx() returns integer as $$
declare my_excpt exception = 'U1001';
begin -- using msgtext as one param of exception
  raise exception my_excpt '%', CURRENT_TIMESTAMP;
  return 1;
end $$ language plpgsql;
CREATE FUNCTION
create function outerfx() returns integer as $$
declare
  my_excpt exception = 'U1001';
  alias_div_by_zero exception = 'U1002';
  my_excpt_def_sqlstate exception;
begin
  begin
raise exception my_excpt_def_sqlstate 'foo';
  exception when my_excpt_def_sqlstate then
raise notice '01 catch: %, %', sqlstate, sqlerrm;
  end;
  begin
raise notice '%', innerfx();
  exception when my_excpt then
raise notice '02 catch: %, %', sqlstate, sqlerrm::timestamp;
  end;
  begin
raise exception division_by_zero 'testing';
  exception when division_by_zero then
raise notice 'Divison by zero: %, %', sqlstate, sqlerrm;
  end;
  raise exception alias_div_by_zero 'Unhandled exception';
  return 1;
end; $$ language plpgsql;
CREATE FUNCTION
select innerfx();
psql:regres.sql:50: ERROR:  2005-06-16 10:12:53.27408+02
DETAIL:  User's exception/notice - sqlstate: U1001, name: my_excpt
HINT:  from RAISE stmt on line 3
select outerfx();
psql:regres.sql:51: NOTICE:  01 catch: U0001, foo
psql:regres.sql:51: NOTICE:  02 catch: U1001, 2005-06-16 10:12:53.274656
psql:regres.sql:51: NOTICE:  Divison by zero: 22012, testing
psql:regres.sql:51: ERROR:  Unhandled exception
DETAIL:  User's exception/notice - sqlstate: U1002, name: 
alias_div_by_zero
HINT:  from RAISE stmt on line 21
drop function outerfx();
DROP FUNCTION
drop function innerfx();
DROP FUNCTION






Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Hans-Jürgen Schönig

2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.



I completely agree with Gavin - integrating this kind of thing into the 
backend writer or integrating it with the FSM would be the ideal solution.


I guess everybody who has already vacuumed a 2 TB relation will agree 
here. VACUUM is not a problem for small "my cat Minka" databases. 
However, it has been a real problem on large, heavy-load databases. I 
have even seen people splitting large tables and joining them with a view 
to avoid long vacuums and long CREATE INDEX operations (I am not joking 
- this is serious).


PostgreSQL is more and more used on really large boxes, so this is an 
increasing problem. Gavin's approach using a vacuum bitmap seems to be a 
good one. An alternative would be to have some sort of vacuum queue 
containing a set of pages which are reported by the writing process (= 
backend writer or backends).


best regards,

hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at




Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Magnus Hagander
 If we do integrated AV, it should only be turned on by 
 default at a relatively low level.  And wasn't there an issue 
 on Windows with AV not working?

AFAIK, it works.
But the fact that you need to provide it with a userid/password combo
makes it a lot harder to install as a service. And it's not installed by
default by the installer, for that reason (and I think maybe others?
Dave, I think you're the one who said no-service-by-default?)

//Magnus



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Gavin Sherry
On Thu, 16 Jun 2005, Hans-Jürgen Schönig wrote:

  2) By no fault of its own, autovacuum's level of granularity is the table
  level. For people dealing with non-trivial amounts of data (and we're not
  talking gigabytes or terabytes here), this is a serious drawback. Vacuum
  at peak times can cause very intense IO bursts -- even with the
  enhancements in 8.0. I don't think the solution to the problem is to give
  users the impression that it is solved and then vacuum their tables during
  peak periods. I cannot stress this enough.


 I completely agree with Gavin - integrating this kind of thing into the
 backend writer or integrating it with the FSM would be the ideal solution.

 I guess everybody who has already vacuumed a 2 TB relation will agree
 here. VACUUM is not a problem for small "my cat Minka" databases.
 However, it has been a real problem on large, heavy-load databases. I
 have even seen people splitting large tables and joining them with a view
 to avoid long vacuums and long CREATE INDEX operations (I am not joking
 - this is serious).

I think this gets away from my point a little. People with 2 TB tables can
take care of themselves, as can people who've taken the time to partition
their tables to speed up vacuum. I'm more concerned about the majority of
people who fall in the middle -- between the hobbyist and the high end
data centre.

Thanks,

Gavin



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Hans-Jürgen Schönig

Gavin Sherry wrote:

On Thu, 16 Jun 2005, Hans-Jürgen Schönig wrote:



2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.



I completely agree with Gavin - integrating this kind of thing into the
backend writer or integrating it with the FSM would be the ideal solution.

I guess everybody who has already vacuumed a 2 TB relation will agree
here. VACUUM is not a problem for small "my cat Minka" databases.
However, it has been a real problem on large, heavy-load databases. I
have even seen people splitting large tables and joining them with a view
to avoid long vacuums and long CREATE INDEX operations (I am not joking
- this is serious).



I think this gets away from my point a little. People with 2 TB tables can
take care of themselves, as can people who've taken the time to partition
their tables to speed up vacuum. I'm more concerned about the majority of
people who fall in the middle -- between the hobbyist and the high end
data centre.

Thanks,

Gavin



I think your approach will help all of them.
If we had some sort of autovacuum (which is packaged with most distros 
anyway - having it in the core is nice as well) and a mechanism to 
improve reallocation / vacuum speed, we would have solved all problems.


I do think that people with 2 TB databases can take care of themselves. 
The question is, however, whether the database can do what they want ...


thanks a lot,

hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at




Re: [HACKERS] [GENERAL] INHERITS and planning

2005-06-16 Thread Simon Riggs
On Thu, 2005-06-16 at 00:55 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Looks bad... but how does it look for 1000 inherited relations? My
  feeling is that we should not be optimizing the case above 1000
  relations. That many partitions is very unwieldy.
 
 Well, it's not so much that I care about queries with 1000+ relations,
 as that this is a good way to stress-test the code and find out where
 the performance issues are.  There are many thousand lines of code that
 can never be performance-sensitive, but to expose the ones that are
 it helps to push the envelope a bit.

I very much agree and I also appreciate you taking the time to look into
this since it clearly has an effect on the usefulness of partitioning. I
wanted to give my opinion that more than 1000 is not a frequent use
case.

 Until Neil fixed the list.c package in 8.0, we had pretty much zero
 chance of avoiding O(N^2) or worse behavior on almost any measure of
 query size N that you cared to name; because most of the internal data
 structures depend on lists.  (You do know that Postgres was once written
 in Lisp, right?)  Now that that basic issue is taken care of, it's worth
 looking at secondary bad behaviors ... I've been doing some hacking in
 this area lately, but it's not all fixed yet.

Yes, I know about that; I agree with the general principles you discuss.

Your suggested fix to the 2000+ inherited relation problem seemed like
it would apply to an area that most people would never use, yet would
have an effect on anybody using LockAcquire. IMHO that is not worth the
effort or risk, and that is from somebody who, as you know, has been
involved in tracking down O(N^2) behaviour in other parts of the code.

Best Regards, Simon Riggs




Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Dave Page
 

 -Original Message-
 From: Magnus Hagander [mailto:[EMAIL PROTECTED] 
 Sent: 16 June 2005 10:15
 To: Josh Berkus; Gavin Sherry
 Cc: Bruce Momjian; pgsql-hackers@postgresql.org; Dave Page
 Subject: RE: [HACKERS] Autovacuum in the backend
 
  If we do integrated AV, it should only be turned on by 
  default at a relatively low level.  And wasn't there an issue 
  on Windows with AV not working?
 
 AFAIK, it works.
 But the fact that you need to provide it with a userid/password combo
 makes it a lot harder to install as a service. And it's not 
 installed by
 default by the installer, for that reason (and I think maybe others?
 Dave, I think you're the one who said no-service-by-default?)

Yes, 'cos there was no easy way to do it securely when I did it, without
a lot of mucking about to set up a .pgpass file in the service user's
account.

It does work perfectly well however, and did so long before PostgreSQL
itself did :-)

Regards, Dave



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Andrew Dunstan
Gavin Sherry said:
 On Thu, 16 Jun 2005, Hans-Jürgen Schönig wrote:

  2) By no fault of its own, autovacuum's level of granularity is the
  table level. For people dealing with non-trivial amounts of data
  (and we're not talking gigabytes or terabytes here), this is a
  serious drawback. Vacuum at peak times can cause very intense IO
  bursts -- even with the enhancements in 8.0. I don't think the
  solution to the problem is to give users the impression that it is
  solved and then vacuum their tables during peak periods. I cannot
  stress this enough.


 I completely agree with Gavin - integrating this kind of thing into the
 backend writer or integrating it with the FSM would be the ideal solution.

 I guess everybody who has already vacuumed a 2 TB relation will agree
 here. VACUUM is not a problem for small "my cat Minka" databases.
 However, it has been a real problem on large, heavy-load databases. I
 have even seen people splitting large tables and joining them with a view
 to avoid long vacuums and long CREATE INDEX operations (I am not
 joking - this is serious).

 I think this gets away from my point a little. People with 2 TB tables
 can take care of themselves, as can people who've taken the time to
 partition their tables to speed up vacuum. I'm more concerned about the
 majority of people who fall in the middle -- between the hobbyist and
 the high end data centre.


My only problem with what you say is the implication that we should not
incorporate AV into the backend until these things have been solved. This
would be one step down a long road, and that's how it should be positioned.

I am very concerned that with Feature Freeze 2 weeks away we seem to be in a
similar position to where we were a year ago. I know we don't even promise
anything, but certainly I and others believed that work was being done to
get AV into the backend in 8.1. Not doing this because we think it could be
lots better would not give people a good impression of our processes. I
certainly don't think it will make matters worse, especially if it's not on
by default.

cheers

andrew





Re: [HACKERS] [GENERAL] INHERITS and planning

2005-06-16 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Your suggested fix to the 2000+ inherited relation problem seemed like
 it would apply to an area that most people would never use, yet would
 have an effect on anybody using LockAcquire.

Just FYI, the proposed fix is already in, and I think it's a net win for
anyone.  LockCountMyLocks was really an artifact of a lock API that's
been superseded by events --- namely the assumption that we want to take
locks in the names of particular transactions rather than in the names
of particular backends.  I put that in around 7.1 or so, primarily to
support session locks for VACUUM, but designed it the way I did with
the idea that subtransactions would someday want it.  In the event,
subtransactions didn't want it --- it was a lot cheaper to add the
backend-private LOCALLOCK tables and make all the subtransaction
bookkeeping happen internally to a backend.  Now that we have LOCALLOCK
the obvious next step is to manage session locks entirely within
LOCALLOCK too, and reduce the shared-memory state to essentially one bit
per lock per backend: I hold it or I don't hold it.  When you know
there is only one proclock per backend, there's no need to search for
other ones and thus LockCountMyLocks goes away again.

regards, tom lane



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Matthew T. O'Connor

Andrew Dunstan wrote:


Gavin Sherry said:
 


I think this gets away from my point a little. People with 2 TB tables
can take care of themselves, as can people who've taken the time to
partition their tables to speed up vacuum. I'm more concerned about the
majority of people who fall in the middle -- between the hobbyist and
the high end data centre.

   



My only problem with what you say is the implication that we should not
incorporate AV into the backend until these things have been solved. This
would be one step down a long road, and that's how it should be positioned.
 



Right, I think if VACUUM is improved then the semantics of AV in the 
backend might change, but I think there will always be a need for some 
maintenance, and a daemon that monitors the maintenance needs of your 
database and fires off appropriate maintenance commands for you is 
good.  No, it doesn't solve all problems, but I think it solves a lot of 
problems for a lot of people.  Besides, VACUUM isn't the only thing AV 
does; it also does ANALYZE to keep your stats up-to-date, and it watches 
for XID wraparound.  It could also look for REINDEX opportunities and 
who knows what else in the future.



I am very concerned that with Feature Freeze 2 weeks away we seem to be in a
similar position to where we were a year ago. I know we don't even promise
anything, but certainly I and others believed that work was being done to
get AV into the backend in 8.1. Not doing this because we think it could be
lots better would not give people a good impression of our processes. I
certainly don't think it will make matters worse, especially if it's not on
by default.
 



I agree.  Also, some people in this thread have been making noises about 
wanting AV on by default.  This might be nice, but I am still leaning 
towards off by default at least in 8.1.






Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Matthew T. O'Connor

Gavin Sherry wrote:


On Thu, 16 Jun 2005, Matthew T. O'Connor wrote:
 


Would you mind expounding on why you think autovacuum isn't suitable for
the general public?  I know it's not a silver bullet, but I think in
general, it will be helpful for most people.
   



As I said, this is largely the fault of VACUUM. The main thing I'd like to
see is a complete solution to the problem. I'm not picking on autovacuum.
However, I will elaborate a little on why I think autovacuum should not
be a feature of the backend:
 



Don't worry, I don't think you are picking on AV.


1) The main argument so far is that autovacuum will ensure that users who
do not read the maintenance section of the manual will not notice a
deterioration of performance. This means that we anticipate autovacuum
being on by default. This suggests that the default autovacuum
configuration will not need tuning. I do not think that will be the case.
 



I disagree with this.  I think the newbie-protection benefits of AV are 
not its primary goal, though I do think it's an important one.  The 
main thing AV brings is the ability to control bloating in your database 
and keep your stats up-to-date no matter what your workload.  It is 
possible for an Admin to set up cron scripts to run VACUUM or ANALYZE on 
particularly needy tables at appropriate intervals, but I guarantee that 
the cron script is going to fire either too many or too few VACUUMs.  
Also, when the workload changes or a new table is added, the Admin then 
needs to update his cron scripts.  This all goes away with AV, and I 
believe this is a much bigger goal than the newbie problem.
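[Editor's note: the cron-based alternative being compared against might look like this; the database name, table name, and schedule are purely illustrative.]

```
# crontab -- hand-scheduled maintenance, the approach AV replaces.
# Nightly whole-database VACUUM ANALYZE during an assumed quiet period:
30 3 * * *   vacuumdb --analyze --quiet mydb
# A busy table vacuumed hourly, whether it needs it or not:
15 * * * *   psql -d mydb -c 'VACUUM ANALYZE hot_table'
```

The fixed schedule is exactly the weakness described above: it fires the same number of VACUUMs regardless of actual update activity, and must be edited by hand whenever the workload or schema changes.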



2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.
 



I agree this is a major problem with VACUUM, but I also think it's a 
different problem.  One advantage of integrated AV is that you will be 
able to set per-table thresholds, including the ability to turn off 
AV for any given table.  If you are running a database with tables this 
big, I think you will be able to figure out how to customize integrated 
AV to your needs.



3) autovacuum on by default means row level stats are on by default. This
will have a non-trivial performance impact on users, IMHO. For right or
wrong, our users take the postgresql.conf defaults pretty seriously and
this level of stats collection could and will remain enabled in some
non-trivial percentage of users who turn autovacuum off (consider many
users' reluctance to change shared_buffers in previous releases). To quote
from the README:

The overhead of the stats system has been shown to be significant under
certain workloads.  For instance, a tight loop of queries performing
"SELECT 1" was found to run nearly 30% slower when row-level stats were
enabled.

I'm not one for "SELECT 1" benchmarks, but this is a problem that hasn't
even been mentioned, as far as I recall.
 



I mentioned this in the README because I thought I should, not because I 
think it's a real problem in practice.  I think a real production 
database doing queries any more complicated than "SELECT 1" 
will probably not notice the difference.




4) Related to this, I guess, is that a user's FSM settings might be
completely inappropriate. The 'Just read the manual' or 'Just read the
logs' argument doesn't cut it, because the main argument for autovacuum in
the backend is that people do not and will not.
 



Agreed, it doesn't solve all problems, and I'm not arguing that the 
integration of AV makes PostgreSQL newbie-safe; it just helps reduce the 
newbie problem.  Again, if the default FSM settings are inappropriate 
for a database, then the user is probably doing something more 
complicated than a "my cat Minka" database and will need to learn some 
tuning skills anyway.



5) It doesn't actually shrink tables -- ie, there's no VACUUM FULL. If
we're telling users about VACUUM less often than we are now, there's bound
to be bloating issues (see 4).
 



Not totally true: regular VACUUM can shrink tables a little (I think 
only if there is free space at the end of the table can it truncate 
without moving data around).  But if AV is on and the settings are 
reasonable, then a table shouldn't bloat much or at all.  Also, I don't 
think we are telling people to VACUUM less; in fact, tables that need it 
will usually get VACUUM'd more often.  We are just telling users that if 
they turn AV on, they don't have to manage all the VACUUMing.



I guess the main point is, if something major like this ships in the
backend it says to users that the problem has gone away. pg_autovacuum is

Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Matthew T. O'Connor

Hans-Jürgen Schönig wrote:

I completely agree with Gavin - integrating this kind of thing into the 
backend writer or integrating it with the FSM would be the ideal solution.



Yes, AV should look at FSM data, and it will eventually.  I'm not sure 
how you would integrate AV with the backend writer, but again, if 
improvements are made to vacuum, AV might have to change along with it; 
I still think it will be needed, or at least helpful.






Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Steve Atkins
On Wed, Jun 15, 2005 at 09:07:30PM -0700, Josh Berkus wrote:
 Josh,
 
  Just my own two cents. First, I am not knocking the work that has been done on
  autovacuum. I am sure that it was a leap on its own to get it to work.
  However I will say that I just don't see the reason for it.
 
 I've personally seen at least a dozen user requests for autovacuum in the 
 backend, and had this conversation about 1,100 times:
 
 NB: After a week, my database got really slow.
 Me: How often are you running VACUUM ANALYZE?
 NB: Running what?

Yes, me too.

I always understood autovacuum to be a way to avoid having newbies get
burned by not vacuuming, and for simplifying the maintenance of lower
traffic databases.

I don't see people with high-traffic databases (relative to the hardware
they're running on) ever using autovacuum with the current state of
vacuum and autovacuum.

If improvements to vacuum (unrelated to autovacuum) reduce the IO load
that would be a great thing, especially for those of us dealing with
24x7 databases. (I really like the dirty bitmap suggestion - it sounds
like a clean way to reduce the amount of work needed). If autovacuum were
extended to allow more flexible scheduling (or even to be aware of the
other IO going on) then it would be of wider use - but I think the real
value of autovacuum is to make sure that new users (Windows...) don't
have a bad experience when they first try PG.

Cheers,
  Steve



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Andreas Pflug

Matthew T. O'Connor wrote:



Right, I think if VACUUM is improved then the semantics of AV in the 
backend might change, but I think there will always be a need for some 
maintenance, and a daemon that monitors the maintenance needs of your 
database and fires off appropriate maintenance commands for you is 
good.  No, it doesn't solve all problems, but I think it solves a lot of 
problems for a lot of people.  Besides, VACUUM isn't the only thing AV 
does; it also does ANALYZE to keep your stats up-to-date, and it watches 
for XID wraparound.  It could also look for REINDEX opportunities and 
who knows what else in the future.


Dave,

I wonder if we should aim to have pgAgent in the backend, which was one 
of the reasons why I considered having it converted from C++ to pure C.


There are many regular maintenance issues that AV can cover, some more 
it could cover, and many more we can't even think of right now. Having an 
SQL-executing agent freely at hand (consistent on _every_ platform, 
without cron/anacron/at/you-name-it dependencies) should be helpful for that.


Regards,
Andreas



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Alvaro Herrera
On Thu, Jun 16, 2005 at 01:32:16AM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  A question for interested parties.  I'm thinking in handling the
  user/password issue by reading the flat files (the copies of pg_shadow,
  pg_database, etc).
 
 Er, what user/password issue?  Context please.
 
  The only thing that I'd need to modify is add the datdba field to
  pg_database, so we can figure out an appropriate user for vacuuming each
  database.
 
 The datdba is not necessarily a superuser, and therefore is absolutely
 not the right answer for any question related to autovacuum.  But in
 any case, I would expect that an integrated-into-the-backend autovac
 implementation would be operating at a level below any permission checks
 --- so this question shouldn't be relevant anyway.

Ok, seems things are quite a bit out of context.  What I did was take
Matthew's patch for integrating contrib pg_autovacuum into the
postmaster.  This patch was posted several times around July and August
2004.  It had several issues: an incorrect shutdown sequence, forcing
libpq to be statically linked into the backend, not correctly using
ereport(), and not using the backend's memory management
infrastructure.

There were several suggestions.  One was to separate it in two parts,
one which would be a process launched by postmaster, and another which
would be a shared library, loaded by that other process, which would in
turn load libpq and issue SQL queries (including but not limited to
VACUUM and ANALYZE queries) to a regular backend, using a regular
connection.

Now, the user/password issue is which user and password combination is
used to connect to the regular backend.  Matthew had created a password
file, to be used in a similar fashion to libpq's password file.  This
works but has the drawback that the user has to set the file correctly.
What I'm proposing is using the flatfiles for this.


Now, I'm hearing that people don't like using libpq.  This means the whole
thing turns a lot more complicated; for one thing, because it will need
to connect to every database in some fashion.  Also, you want it to
skip normal permission checks, which would be doable only if it's not
using libpq.  On the other hand, if there were multiple autovacuum
processes, one per database, it'd be all much easier, without using
libpq.

Could we clarify what scenario people are envisioning?  I don't want to
waste time fixing code that in the end is going to be declared as
fundamentally flawed -- I'd rather work on shared dependencies.

Some people say keep it simple and have one process per cluster.  I
think they don't realize it's actually more complex, not the other way
around.  The only additional complexity is how to handle concurrent
vacuuming, but the code turns out to be simpler because we have access
to system catalogs and standard backend infrastructure in a simple
fashion.



A wholly separate approach is what should the autovacuum daemon be
doing.  At present we only have full vacuum, vacuum and analyze.
In the future this can be extended and autovacuum can launch partial
vacuums, nappy vacuums, bitmapped vacuums, coffee-with-cream vacuums.
But we need to start somewhere.

-- 
Alvaro Herrera (alvherre[a]surnet.cl)
¿Qué importan los años?  Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo  (Mafalda)



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Alvaro Herrera
On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:

 2) By no fault of its own, autovacuum's level of granularity is the table
 level. For people dealing with non-trivial amounts of data (and we're not
 talking gigabytes or terabytes here), this is a serious drawback. Vacuum
 at peak times can cause very intense IO bursts -- even with the
 enhancements in 8.0. I don't think the solution to the problem is to give
 users the impression that it is solved and then vacuum their tables during
 peak periods. I cannot stress this enough.

People running systems with petabyte-sized tables can disable autovacuum
for those tables, and leave it running for the rest.  Then they can
schedule whatever maintenance they see fit on their gigantic tables.
Trying to run a database with more than a dozen gigabytes of data
without expert advice (or at least reading the manual) would be
extremely stupid anyway.

-- 
Alvaro Herrera (alvherre[a]surnet.cl)
¿Cómo puedes confiar en algo que pagas y que no ves,
y no confiar en algo que te dan y te lo muestran? (Germán Poo)



Re: [HACKERS] [PATCHES] Escape handling in strings

2005-06-16 Thread Andrew Dunstan


[switched to -hackers]

Tom Lane wrote:


Rod Taylor [EMAIL PROTECTED] writes:
 


It probably won't be any worse than when '' was rejected for an integer
0.
   



That analogy is *SO* far off the mark that I have to object.

Fooling with quoting rules will not simply cause clean failures, which
is what you got from ''-no-longer-accepted-by-atoi.  What it will cause
is formerly valid input being silently interpreted as something else.
That's bad enough, but it gets worse: formerly secure client code may
now be vulnerable to SQL-injection attacks, because it doesn't know how
to quote text properly.

What we are talking about here is an extremely significant change with
extremely serious consequences, and imagining that it is not will be
a recipe for disaster.


 

All true. Conversely, there does need to be a path for us to get to 
standard behaviour.


I think we're going to need to provide for switchable behaviour, as ugly 
as that might be (looking briefly at scan.l it looks like the simplest 
way would be a separate state for being inside standard strings, with 
the choice of state being made conditionally in the {xqstart} rule).


We can't just break backwards compatibility overnight like this.

cheers

andrew



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Hans-Jürgen Schönig

Alvaro Herrera wrote:

On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:



2) By no fault of its own, autovacuum's level of granularity is the table
level. For people dealing with non-trivial amounts of data (and we're not
talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give
users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.



People running systems with petabyte-sized tables can disable autovacuum
for those tables, and leave it running for the rest.  Then they can
schedule whatever maintenance they see fit on their gigantic tables.
Trying to run a database with more than a dozen gigabytes of data
without expert advice (or at least reading the manual) would be
extremely stupid anyway.




Professional advice won't help you here, because you still have to vacuum 
this giant table. This is especially critical in the case of 24x7 systems 
(which are quite frequent). In many cases there is no maintenance window 
anymore (e.g. a wastewater system simply runs 24x7).


Reducing the impact of VACUUM and CREATE INDEX would be important to 
many people. To me, improving vacuum is as important as Jan's bgwriter 
patch (it reduces the troubles people had with checkpoints).


best regards,

hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at




Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Now, I'm hearing people don't like using libpq.

Yeah --- a libpq-based solution is not what I think of as integrated at
all, because it cannot do anything that couldn't be done by the existing
external autovacuum process.  About all you can buy there is having the
postmaster spawn the autovacuum process, which is slightly more
convenient to use but doesn't buy any real new functionality.

 Some people say keep it simple and have one process per cluster.  I
 think they don't realize it's actually more complex, not the other way
 around.

Agreed.  If you aren't connected to a specific database, then you cannot
use any of the normal backend infrastructure for catalog access, which
is pretty much a killer limitation.

A simple approach would be a persistent autovac background process for
each database, but I don't think that's likely to be acceptable because
of the amount of resources tied up (PGPROC slots, open files, etc).

One thing that might work is to have the postmaster spawn an autovac
process every so often.  The first thing the autovac child does is pick
up the current statistics dump file (which it can find without being
connected to any particular database).  It looks through that to
determine which database is most in need of work, then connects to that
database and does some reasonable amount of work there, and finally
quits.  Awhile later the postmaster spawns another autovac process that
can connect to a different database and do work there.
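The per-spawn decision sketched above - read the stats dump, pick the neediest database, do bounded work there - might look roughly like this (Python standing in for the backend C purely for illustration; the stats layout and the dead-tuple criterion are assumptions, not the actual format):

```python
def pick_neediest_database(stats):
    """Pick the database most in need of vacuuming from a parsed
    statistics dump.  'stats' maps database name -> activity counters;
    dead-tuple count stands in for whatever tunable criterion the
    real autovac child would apply."""
    return max(stats, key=lambda db: stats[db].get("dead_tuples", 0))

# Each spawned autovac child would then:
#   1. parse the current statistics dump file into 'stats'
#   2. db = pick_neediest_database(stats)
#   3. connect to db, do a reasonable amount of work, and quit
```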

This design would mean that the autovac process could not have any
long-term state of its own: any long-term state would have to be in
either system catalogs or the statistics.  But I don't see that as
a bad thing really --- exposing the state will be helpful from a
debugging and administrative standpoint.

regards, tom lane



Re: [HACKERS] [PATCHES] Escape handling in strings

2005-06-16 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 All true. Conversely, there does need to be a path for us to get to 
 standard behaviour.

Yes --- but the important word there is path.  I think we have to do
this in stages over a number of releases, to give people time to
migrate.

Assuming that the end result we want to get to is:
1. Plain '...' literals are per SQL spec: '' for embedded
   quotes, backslashes are not special.
2. We add a construct E'...' that handles backslash escapes
   the same way '...' literals do today.

I think what would be reasonable for 8.1 is to create the E'...'
construct --- which will not cause any backwards compatibility issues
that I can see --- document it and encourage people to migrate,
and start throwing warnings about use of \' in non-E literals.
(We could have a GUC variable to suppress the warnings; I'm of
the opinion that it would be better not to, though, because the point
is to get people out of that habit sooner rather than later.)

I would be inclined to leave things like that for a couple of release
cycles before we disable backslashes in regular literals.  By the time
we do that, we should have at least flushed out the cases where
disabling backslashes will create security holes.
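The client-side quoting hazard can be made concrete: doubling embedded single quotes is the SQL-standard form and yields the same result whether or not backslashes are treated as escapes, whereas a client that emits \' silently breaks once backslashes stop being special. A minimal sketch (Python for illustration only; this is not an actual libpq helper):

```python
def quote_literal(s):
    # SQL-standard quoting: double every embedded single quote.
    # Unlike escaping with a backslash (\'), this produces a literal
    # that parses identically under both the old backslash-escape
    # rules and the proposed spec-compliant rules, so client code
    # written this way survives the transition.
    return "'" + s.replace("'", "''") + "'"
```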

 I think we're going to need to provide for switchable behaviour, as ugly 
 as that might be (looking briefly at scan.l it looks like the simplest 
 way would be a separate state for being inside standard strings, with 
 the choice of state being made conditionally in the {xqstart} rule).

I really really dislike that idea; it is a recipe for creating problems
not solving them.

The hard part in all this is to create apps that will survive the
transition gracefully.  I think the only way for that is to implement
a reporting feature that lets the app know whether backslashes are
special in plain literals or not.  We already have the mechanism for
that, ie read-only GUC variables with GUC_REPORT enabled (which we use
for integer datetimes, for instance).  But I really believe it is
important that this be a *read only* thing not something that can be
flipped around at runtime.  Anyway, the reporting variable is another
thing that should appear in 8.1.

regards, tom lane



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Dave Page
 

 -Original Message-
 From: Andreas Pflug [mailto:[EMAIL PROTECTED] 
 Sent: 16 June 2005 15:14
 To: Dave Page
 Cc: Matthew T. O'Connor; Andrew Dunstan; 
 [EMAIL PROTECTED]; [EMAIL PROTECTED]; 
 [EMAIL PROTECTED]; pgman@candle.pha.pa.us; 
 pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Autovacuum in the backend
 
 Dave,
 
 i wonder if we should aim to have pgAgent in the backend 
 which was one 
 of the reasons why I considered to have it converted from C++ 
 to pure C.

In previous discussions on -hackers when ppl raised the idea of
something like pgAgent being built into the backend, istm that the
majority of people were against the idea. 

Regards, Dave.



Re: [HACKERS] PROPOSAL - User's exception in PL/pgSQL

2005-06-16 Thread Josh Berkus
Pavel,

 o User can specify SQLSTATE only from class 'U1'
 o Default values for SQLSTATE usr excpt are from class 'U0'
 o Every exception's variable has unique SQLSTATE
 o User's exception or system's exception can be raised only with
  level EXCEPTION

 Any comments, notes?

Looks great to me, pending a code examination.   Will it also be possible to 
query the SQLSTATE/ERRSTRING  inside the EXCEPTION clause?   i.e.

WHEN OTHERS THEN
RAISE NOTICE '%',sqlstate;
ROLLBACK;

That's something missing from 8.0 exception handling that makes it hard to 
improve SPs with better error messages.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Josh Berkus
Alvaro,

 coffee-with-cream vacuums.

I tried this and now my Hoover makes this horrible noise and smokes.  ;-)

All:

Seriously, all:  when I said that users were asking for Autovac in the 
backend (AVitB), I wasn't talking just about the newbies on #postgresql.   I'm also 
talking about companies like Hyperic, and whole groups like postgresql.org.br.   
This is a feature that people want, and unless there's something 
fundamentally unstable about it, it seems really stupid to hold it back 
because we're planning VACUUM improvements for 8.2.

AVitB has been on the TODO list for 2 versions.   There's been 2 years to 
question its position there.   Now people are bringing up objections when 
there's no time for discussion left?  This stinks.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Josh Berkus
People,

 AVitB has been on the TODO list for 2 versions.  There's been 2 years to
 question its position there.  Now people are bringing up objections when
 there's no time for discussion left? This stinks.

Hmmm ... to be specific, I'm referring to the objections to the *idea* of 
AVitB, not the problems with the current patch.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Josh Berkus
Dave,

 In previous discussions on -hackers when ppl raised the idea of
 something like pgAgent being built into the backend, istm that the
 majority of people were against the idea.

Well, you're up against the minimalist approach to core PostgreSQL there.  It 
would pretty much *have* to be an optional add-in, even if it was stored in 
pg_catalog.  I can see a lot of uses for a back-end job scheduler myself, but 
it would need to go through the gauntlet of design criticism first *wry 
grin*.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



[HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Pavel Stehule
Hello

   Statement CONTINUE isn't in PL/SQL either, I know, but Oracle PL/SQL 
has a GOTO statement. I don't need GOTO, but CONTINUE can be 
very useful for me. I have to use some ugly tricks now. With a little change, 
we can enhance the EXIT statement to get CONTINUE behavior.

After some work I can

CREATE OR REPLACE FUNCTION lll() RETURNS void AS $$
DECLARE i integer = 0;
BEGIN
  LOOP
i = i + 1;
CONTINUE WHEN i < 10;
RAISE NOTICE '%', i;
EXIT;
  END LOOP;
  BEGIN
CONTINUE WHEN i = 10;
RAISE NOTICE '---1---';
  END;
  RAISE NOTICE '---2---';
  FOR _i IN 1 .. 10 LOOP
CONTINUE WHEN _i < 5;
RAISE NOTICE '%', _i;
  END LOOP;
END; $$ LANGUAGE plpgsql;
select lll();

pokus=# NOTICE:  10
NOTICE:  ---2---
NOTICE:  5
NOTICE:  6
NOTICE:  7
NOTICE:  8
NOTICE:  9
NOTICE:  10
 lll
-----

(1 row)

What do you think about it? It breaks PL/SQL compatibility, I know, but 
from the last discussion my understanding is that Oracle compatibility isn't the main 
objective of PL/pgSQL. There are already some smaller and bigger differences: SQLSTATE, 
EXCEPTION from my last proposal, etc.

What do you think about it?

Regards
Pavel Stehule





Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Douglas McNaught
Josh Berkus josh@agliodbs.com writes:

 Seriously, all:  when I said that users were asking for Autovac in the 
 backend (AVitB), I wasn't talking just the newbies on #postgresql.   I'm also 
 talking companies like Hyperic, and whole groups like the postgresql.org.br.  
  
 This is a feature that people want, and unless there's something 
 fundamentally unstable about it, it seems really stupid to hold it back 
 because we're planning VACUUM improvements for 8.2.

Agreed, and I don't see AVitB as standing in the way of any of those
proposed improvements--it's just that AVitB has a chance of making it
into 8.1, and none of the proposed improvements do.  I don't see why
people are objecting.

Also, count me in the "turn it on by default" crowd--I'd rather not
have newbies see unending file bloat from normal usage, it just looks
bad.  Anyone who plans to deploy for large databases and high loads
needs to learn to tune (just as with any other database) and can make
an informed decision about whether AV should be on or not.

-Doug



Re: [HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Josh Berkus
Pavel,

  Statement CONTINUE isn't in PL/SQL too, I know it, but Oracle PL/SQL
 has statement GOTO. I don't need GOTO statement, but 'continue' can be
 very usefull for me. I have to do some ugly trick now. With little change,
 we can enhance stmt EXIT for behavior continue.

Can you explain a little better what CONTINUE does that's different from EXIT?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Matthew T. O'Connor

Tom Lane wrote:


Alvaro Herrera [EMAIL PROTECTED] writes:
 


Now, I'm hearing people don't like using libpq.
   



Yeah --- a libpq-based solution is not what I think of as integrated at
all, because it cannot do anything that couldn't be done by the existing
external autovacuum process.  About all you can buy there is having the
postmaster spawn the autovacuum process, which is slightly more
convenient to use but doesn't buy any real new functionality.
 



Yes, libpq has to go; I thought this was clear, but perhaps I didn't say 
it clearly enough.  Anyway, this was the stumbling block which prevented 
me from making more progress on autovacuum integration.




Some people say keep it simple and have one process per cluster.  I
think they don't realize it's actually more complex, not the other way
around.
   



A simple approach would be a persistent autovac background process for
each database, but I don't think that's likely to be acceptable because
of the amount of resources tied up (PGPROC slots, open files, etc).
 



Agreed, this seems ugly.


One thing that might work is to have the postmaster spawn an autovac
process every so often.  The first thing the autovac child does is pick
up the current statistics dump file (which it can find without being
connected to any particular database).  It looks through that to
determine which database is most in need of work, then connects to that
database and does some reasonable amount of work there, and finally
quits.  Awhile later the postmaster spawns another autovac process that
can connect to a different database and do work there.
 



I don't think you can use a dump to determine which database to connect to 
next, since you don't really know what happened since the last time you 
exited.  What was a priority 5 or 10 minutes ago might not be a priority 
now.



This design would mean that the autovac process could not have any
long-term state of its own: any long-term state would have to be in
either system catalogs or the statistics.  But I don't see that as
a bad thing really --- exposing the state will be helpful from a
debugging and administrative standpoint.



This is not a problem, as my patch, which Alvaro has now taken over, 
already created a new system catalog for all autovac data, so autovac 
really doesn't contain any static persistent data.


The rough design I had in mind was:
1)  On startup, the postmaster spawns the master autovacuum process.
2)  The master autovacuum process spawns a backend to do the vacuuming 
work on a particular database.
3)  The master autovacuum waits for this process to exit, then spawns the 
next backend for the next database.
4)  Repeat this loop until all databases in the cluster have been 
checked, then sleep for a while, and start over again.
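That loop might be sketched roughly as follows (Python pseudocode for the shape of it only; vacuum_db stands in for spawning a backend and waiting for it, and the real process would of course run indefinitely rather than for a fixed number of rounds):

```python
import time

def autovacuum_master(databases, vacuum_db, rounds=1, sleep_s=0):
    """Sketch of the proposed master loop: vacuum each database in
    turn, sleep for a while, then start over."""
    visited = []
    for _ in range(rounds):
        for db in databases:
            vacuum_db(db)          # stand-in for spawning a backend
            visited.append(db)     # ...and waiting for it to exit
        time.sleep(sleep_s)        # sleep between full passes
    return visited
```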


I'm not sure if this is feasible, or if this special master autovacuum 
process would be able to fork off or request that the postmaster fork 
off an autovacuum process for a particular database in the cluster.  
Thoughts or comments?


Matthew




Re: [HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Bruno Wolff III
On Thu, Jun 16, 2005 at 09:40:16 -0700,
  Josh Berkus josh@agliodbs.com wrote:
 Pavel,
 
   Statement CONTINUE isn't in PL/SQL too, I know it, but Oracle PL/SQL
  has statement GOTO. I don't need GOTO statement, but 'continue' can be
  very usefull for me. I have to do some ugly trick now. With little change,
  we can enhance stmt EXIT for behavior continue.
 
 Can you explain a little better what CONTINUE does that's different from EXIT?

I suspect that CONTINUE is supposed to start the next iteration of the
loop, while EXIT is supposed to terminate the loop.
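In loop terms (Python used here purely to illustrate the semantics), CONTINUE corresponds to `continue` and EXIT to `break`:

```python
printed = []
for i in range(1, 11):      # like FOR i IN 1 .. 10 LOOP
    if i < 5:
        continue            # like CONTINUE WHEN i < 5
    printed.append(i)       # like RAISE NOTICE '%', i
# printed == [5, 6, 7, 8, 9, 10]; replacing 'continue' with 'break'
# would instead terminate the loop at i = 1, which is EXIT's behavior
```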



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Tom Lane
Matthew T. O'Connor matthew@zeut.net writes:
 I don't think you can use a dump to determine who should be connected to 
 next since you don't really know what happened since the last time you 
 exited.  What was a priority 5 or 10 minutes ago might not be a priority 
 now.

Well, the information necessary to make that decision has to be
available from the statistics file.  This doesn't seem like an
insuperable problem.

 The rough design I had in mind was:
 1)  On startup postmaster spawns the master autovacuum process
 2)  The master autovacuum process spawns backends to do the vacuuming 
 work on a particular database
 3)  The master autovacuum waits for this process to exit, then spawns the 
 next backend for the next database
 4)  Repeat this loop until all databases in the cluster have been 
 checked, then sleep for a while, and start over again.

This is unworkable, I believe, because backends have to be direct
children of the postmaster.  I don't recall the details at the moment
but there are IPC signaling reasons for it.

 I'm not sure if this is feasible, or if this special master autovacuum 
 process would be able to fork off or request that the postmaster fork 
 off an autovacuum process for a particular database in the cluster.  
 Thoughts or comments?

It's possible that we could add some signaling whereby the autovac
master could request the postmaster to fork a child into a particular
database.  I'm not sure why this is a lot better than keeping the
stats out where everyone can see them...

regards, tom lane



Re: [HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
   BEGIN
 CONTINUE WHEN i = 10;
 RAISE NOTICE '---1---';
   END;

I find that really ugly and confusing.  If we add a CONTINUE it's only
sensible to allow it inside a loop --- otherwise it's just a nonstandard
spelling of EXIT.

regards, tom lane



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Matthew T. O'Connor

Tom Lane wrote:


Matthew T. O'Connor matthew@zeut.net writes:
 

I don't think you can use a dump to determine who should be connected to 
next since you don't really know what happened since the last time you 
exited.  What was a priority 5 or 10 minutes ago might not be a priority 
now.
   



Well, the information necessary to make that decision has to be
available from the statistics file.  This doesn't seem like an
insuperable problem.
 



Interesting, so the postmaster would kick off an autovacuum process, 
which would read in data from the stats system by hand (it can do this 
because the stats system writes its data to flat files?).   I don't know 
how complicated this might be, but perhaps a simpler method is to just 
have each autovacuum process write a file for itself noting what 
database it should connect to next.  This would work fine, assuming we 
want to continue to loop through all the databases in much the same 
fashion as pg_autovacuum currently does.




The rough design I had in mind was:
1)  On startup postmaster spawns the master autovacuum process
2)  The master autovacuum process spawns backends to do the vacuuming 
work on a particular database
3)  The master autovacuum waits for this process to exit, then spawns the 
next backend for the next database
4)  Repeat this loop until all databases in the cluster have been 
checked, then sleep for a while, and start over again.
   


It's possible that we could add some signaling whereby the autovac
master could request the postmaster to fork a child into a particular
database.  I'm not sure why this is a lot better than keeping the
stats out where everyone can see them...



Ok.



Re: [HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Andrew Dunstan



Pavel Stehule wrote:



What do you think about it? It breaks PL/SQL compatibility, I know, but 
from the last discussion my understanding is that Oracle compatibility isn't the main 
objective of PL/pgSQL. There are already some smaller and bigger differences: SQLSTATE, 
EXCEPTION from my last proposal, etc.



 



Well, yes, but I don't think we should break compatibility 
arbitrarily.  I guess it could be argued that this is a missing feature 
in PL/SQL and its Ada parent - implementing GOTO just to handle this 
case seems unnecessary.


I agree with Tom that it should only be allowed inside a loop.

cheers

andrew



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Hannu Krosing


On N, 2005-06-16 at 11:42 -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
...
  Some people say keep it simple and have one process per cluster.  I
  think they don't realize it's actually more complex, not the other way
  around.

 Agreed.  If you aren't connected to a specific database, then you cannot
 use any of the normal backend infrastructure for catalog access, which
 is pretty much a killer limitation.

 A simple approach would be a persistent autovac background process for
 each database, but I don't think that's likely to be acceptable because
 of the amount of resources tied up (PGPROC slots, open files, etc).

In this case it should also be configurable which databases get
their own AV processes.

Also, there is probably no need to keep an AV process running very long
after the last real backend for that database has closed, as there won't
be any changes anyway.

Having one AV process per DB will likely be a problem only for
installations where there are very many single-user, always-connected
databases, which I don't expect to be that common.

And I also expect that soon (after my vacuums-don't-step-on-each-other
patch goes in), there will be a need for running several vacuums in
parallel on the same database (say, one with non-intrusive vacuum_page
settings for a really large table and several more aggressive ones for
fast-changing small tables at the same time). AFAICS this will also need
several backends - at least one for each parallel vacuum.

 One thing that might work is to have the postmaster spawn an autovac
 process every so often.

My fastest manual vacuum does its job in 5 sec and is repeated at 10 sec
intervals - will this design be able to match that?

 The first thing the autovac child does is pick
 up the current statistics dump file (which it can find without being
 connected to any particular database).  It looks through that to
 determine which database is most in need of work, then connects to that
 database and does some reasonable amount of work there, and finally
 quits.  Awhile later the postmaster spawns another autovac process that
 can connect to a different database and do work there.

 This design would mean that the autovac process could not have any
 long-term state of its own: any long-term state would have to be in
 either system catalogs or the statistics.  But I don't see that as
 a bad thing really --- exposing the state will be helpful from a
 debugging and administrative standpoint.

--
Hannu Krosing [EMAIL PROTECTED]



Re: [HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Jonah H. Harris

As a near-daily PL/pgSQL developer, I similarly agree.

-Jonah

Andrew Dunstan wrote:




Pavel Stehule wrote:



What do you think about it? It breaks PL/SQL compatibility, I know, 
but from the last discussion my understanding is that Oracle compatibility 
isn't the main objective of PL/pgSQL. There are already some smaller and 
bigger differences: SQLSTATE, EXCEPTION from my last proposal, etc.



 



Well, yes, but I don't think we should break compatibility
arbitrarily. I guess it could be argued that this is a missing
feature in PL/SQL and its Ada parent - implementing GOTO just to
handle this case seems unnecessary.


I agree with Tom that it should only be allowed inside a loop.

cheers

andrew







Re: [HACKERS] PROPOSAL - User's exception in PL/pgSQL

2005-06-16 Thread Pavel Stehule
On Thu, 16 Jun 2005, Josh Berkus wrote:

 Pavel,
 
  o Users can specify a SQLSTATE only from class 'U1'
  o Default SQLSTATE values for user exceptions are from class 'U0'
  o Every exception variable has a unique SQLSTATE
  o User or system exceptions can be raised only with level EXCEPTION
 
  Any comments, notes?
 
 Looks great to me, pending a code examination.   Will it also be possible to 
 query the SQLSTATE/ERRSTRING  inside the EXCEPTION clause?   i.e.
 
 WHEN OTHERS THEN
   RAISE NOTICE '%',sqlstate;
   ROLLBACK;

Yes, of course. CVS HEAD can do it now.
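For example, a minimal sketch of the kind of handler being discussed (the function name is invented, and reading SQLSTATE inside an EXCEPTION clause assumes the behavior now in CVS):

```sql
CREATE OR REPLACE FUNCTION demo_sqlstate() RETURNS void AS $$
BEGIN
    PERFORM 1 / 0;          -- force a division_by_zero error
EXCEPTION
    WHEN OTHERS THEN
        -- report the error code of the exception just caught
        RAISE NOTICE 'caught error, SQLSTATE = %', SQLSTATE;
END;
$$ LANGUAGE plpgsql;
```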

 
 That's something missing from 8.0 exception handling that makes it hard to 
 improve SPs with better error messages.
 
 




Re: [HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Pavel Stehule
On Thu, 16 Jun 2005, Josh Berkus wrote:

 Pavel,
 
  Statement CONTINUE isn't in PL/SQL either, I know, but Oracle PL/SQL
  has a GOTO statement. I don't need GOTO, but CONTINUE could be very
  useful for me. I have to do some ugly tricks now. With a little change,
  we could give the EXIT statement continue behavior.
 
 Can you explain a little better what CONTINUE does that's different from EXIT?
 

CONTINUE is equivalent to starting the next iteration of the loop;
EXIT breaks out of the loop. Inside a plain block statement (rather
than a loop), break and continue would behave the same.

Pavel 




Re: [HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Pavel Stehule
On Thu, 16 Jun 2005, Tom Lane wrote:

 Pavel Stehule [EMAIL PROTECTED] writes:
BEGIN
  CONTINUE WHEN i = 10;
  RAISE NOTICE '---1---';
END;
 
 I find that really ugly and confusing.  If we add a CONTINUE it's only
 sensible to allow it inside a loop --- otherwise it's just a nonstandard
 spelling of EXIT.
 

I played too much :-). But there is something which can complicate the
implementation if I disallow it inside a block:

FOR ... LOOP
  BEGIN
    CONTINUE;
  END;
END LOOP;

If I allow CONTINUE inside a BEGIN block, I have easy rules for the
implementation: I just have to find the nearest enclosing loop. But I
don't think it's a real problem.

Pavel Stehule




Re: [HACKERS] Proposal - Continue stmt for PL/pgSQL

2005-06-16 Thread Pavel Stehule
 
 Well, yes, but I don't think we should break compatibility 
 arbitrarilly.  I guess it could be argued that this is a missing feature 
 in PL/SQL and its Ada parent -  implementing GOTO just to handle this 
 case seems  unnecessary.

Yes. I haven't used GOTO in five years :-). A CONTINUE statement is
cleaner and more readable.

now:

   FOR i IN 1 .. 100 LOOP
     continue := true;
     WHILE continue LOOP
       ...
       EXIT;               -- acts as CONTINUE for the outer loop
       continue := false;  -- really exit
     END LOOP;
   END LOOP;

with continue

  FOR i IN 1 .. 100 LOOP
...
EXIT WHEN ..
CONTINUE WHEN ..
  END LOOP;

One argument for CONTINUE inside a BEGIN block - for discussion only:
in a loop, EXIT means break the iteration and CONTINUE means start a new
one, so CONTINUE and EXIT are symmetric.

I didn't know that Ada doesn't have CONTINUE. In PL/pgSQL there is no
problem implementing a CONTINUE statement (with any syntax), but a GOTO
statement would mean relatively big changes in the source code.
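A sketch of how the two symmetric statements would read together inside one loop (illustrative only; the syntax follows the proposal above):

```sql
FOR i IN 1 .. 100 LOOP
    CONTINUE WHEN i % 2 = 0;   -- skip even values: next iteration
    EXIT     WHEN i > 90;      -- leave the loop entirely
    RAISE NOTICE 'processing %', i;
END LOOP;
```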

Pavel  

 

 
 I agree with Tom that it should only be allowed inside a loop.
 
 cheers
 
 andrew
 
 




[HACKERS] Utility database (Was: RE: Autovacuum in the backend)

2005-06-16 Thread Dave Page
 

 -Original Message-
 From: Josh Berkus [mailto:[EMAIL PROTECTED] 
 Sent: 16 June 2005 17:29
 To: Dave Page
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Autovacuum in the backend
 
 Dave,
 
  In previous discussions on -hackers when ppl raised the idea of
  something like pgAgent being built into the backend, istm that the
  majority of people were against the idea.
 
 Well, you're up against the minimalist approach to core PostgreSQL
 there. It would pretty much *have* to be an optional add-in, even if
 it was stored in pg_catalog. I can see a lot of uses for a back-end
 job scheduler myself, but it would need to go through the gauntlet of
 design criticism first <wry grin>.

And as we all know, optional means pgFoundry or someplace else. To be
honest, I simply couldn't be bothered on this one because even if I
could convince everyone to allow such a beast on the backend, the
arguments about how it should work would probably go on forever.
Consequently it's well and truly part of pgAdmin now :-).

One related idea that I have been meaning to moot for a while now
though, is that of a 'utility' database. One of the problems we've
always had in pgAdmin (and presumably phpPgAdmin as well), is that the
only database we know exists with any reasonable surety is template1,
and consequently, this is the default database that pgAdmin connects to.
There are obvious problems with this - in particular:

- Newbies may not realise the significance of making their initial
experiments in template1
- Administrators may not want users connecting to template1
- We don't want to create utility objects in template1 to offer enhanced
functionality in the client.

To overcome this, an additional database created by initdb would be very
useful. This would be roughly the equivalent of SQL Server's 'msdb'
database and would allow:

- A default non-template database for apps to connect to initially
- A standard place for apps like pgAgent to store their cluster-specific
configuration  data
- A standard place for apps like pgAdmin to store utility objects

What are people's thoughts on this?

Regards, Dave.



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Andreas Pflug

Josh Berkus wrote:

Dave,



In previous discussions on -hackers when ppl raised the idea of
something like pgAgent being built into the backend, istm that the
majority of people were against the idea.



Well, you're up against the minimalist approach to core PostgreSQL
there. It would pretty much *have* to be an optional add-in, even if it
was stored in pg_catalog. I can see a lot of uses for a back-end job
scheduler myself, but it would need to go through the gauntlet of design
criticism first <wry grin>.


You want to scare me, don't you? :-)

We have a growing zoo of daemons that can be regarded as tightly
integrated server add-on processes (Slony, autovacuum, pgAgent), and it
would be really nice (win32 users are used to it, and thus expect it) to
have a single point of control.


Maybe a super-daemon (on win32 probably pg_ctl), controlling the
postmaster and all those helper processes (accessible through pgsql
functions, of course), would be the solution. This keeps the kernel
clean, separates backend shared memory from the helper processes, and
enables control over all processes.


Regards,
Andreas



[HACKERS] Spikesource now doing automated testing, sponsorship of PostgreSQL

2005-06-16 Thread Josh Berkus
Hackers,

SpikeSource is now doing automated testing of the PostgreSQL code in their 
stack testing platform.  This includes:

*   SpikeSource has incorporated the testing of PostgreSQL into
the company's 22,000 nightly automated test runs. SpikeSource includes
the code coverage of PostgreSQL, as well as Postgres JDBC drivers and
the phpPgAdmin tool into this environment.

*   Sponsorship of Christopher Kings-Lynne in adding Slony-I management 
tools to phpPgAdmin

*   Sponsorship of me writing migration tools and documentation for 
PostgreSQL.

You can see the component tests here: 
http://www.spikesource.com/spikewatch/index.jsp

And the PG information page here:
http://www.spikesource.com/info/summary.php?c=POSTGRESQL

What this all means is that SpikeSource has started the process of building 
and testing PostgreSQL with numerous popular components (they still need 
to add a lot).  This should supplement pgBuildfarm and limit future 
accidental plug-in breakage.


-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Andrew Dunstan



Andreas Pflug wrote:



We're having a growing zoo of daemons that can be regarded as tightly 
integrated server add-on processes (slony, autovac, pgAgent), and it 
would be really nice (say: win32 users are used to it, thus requiring 
it) to have a single point of control.


Maybe a super-daemon (on win32 probably pg_ctl), controlling the
postmaster and all those helper processes (accessible through pgsql
functions, of course), would be the solution. This keeps the kernel
clean, separates backend shmem from helper processes and enables
control over all processes.




And this will be ready when? I thought we were discussing what could be 
done regarding AVitB between now and feature freeze for 8.1 in about 2 
weeks. This surely doesn't come into that category.


cheers

andrew



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Gavin Sherry
On Thu, 16 Jun 2005, Alvaro Herrera wrote:

 On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:

  2) By no fault of its own, autovacuum's level of granularity is the table
  level. For people dealing with non-trivial amounts of data (and we're not
  talking gigabytes or terabytes here), this is a serious drawback. Vacuum
  at peak times can cause very intense IO bursts -- even with the
  enhancements in 8.0. I don't think the solution to the problem is to give
  users the impression that it is solved and then vacuum their tables during
  peak periods. I cannot stress this enough.

 People running systems with petabyte-sized tables can disable autovacuum
 for those tables, and leave it running for the rest.  Then they can
 schedule whatever maintenance they see fit on their gigantic tables.
 Trying to run a database with more than a dozen gigabytes of data
 without expert advice (or at least reading the manual) would be
 extremely stupid anyway.

As I've said a few times, I'm not concerned about such users. I'm
concerned about users with some busy tables of a few hundred megabytes. I
still don't think VACUUM at arbitrary times on such tables is suitable.

Thanks,

Gavin



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Andreas Pflug

Andrew Dunstan wrote:




Andreas Pflug wrote:



We're having a growing zoo of daemons that can be regarded as tightly 
integrated server add-on processes (slony, autovac, pgAgent), and it 
would be really nice (say: win32 users are used to it, thus requiring 
it) to have a single point of control.


Maybe a super-daemon (on win32 probably pg_ctl), controlling the
postmaster and all those helper processes (accessible through pgsql
functions, of course), would be the solution. This keeps the kernel
clean, separates backend shmem from helper processes and enables
control over all processes.




And this will be ready when? 


This is certainly 8.2 stuff. I'll be sufficiently glad if the
instrumentation stuff that was posted pre-8.0, and left out back then
because the committers ran out of time, makes it into 8.1...


I thought we were discussing what could be done regarding AVitB 
between now and feature freeze for 8.1 in about 2 weeks. This surely 
doesn't come into that category.


I agree with earlier posters that we should have AV on by default, so
that the system performs correctly out of the box for smaller
installations. Even a functionally cut-down version of AV running by
default, which has to be stopped and replaced by a more sophisticated
solution for high-performance installations, is better than what we have
now.


Regards,
Andreas




[HACKERS] MemoryContextAlloc: invalid request size

2005-06-16 Thread Brusser, Michael

Our customer is reporting a database problem after they ran into a
disk quota/filesystem-full situation.

This is Postgres 7.2.x on Solaris. The database server starts without
any obvious errors, but the app server cannot establish a connection.
(It is set up to use a Unix domain socket.)

I did not have a chance to see the database log, but we have the error
log and the truss log from the app server.


In the error-log I see this:
 ... ...
 Connection to database failed
 FATAL: MemoryContextAlloc: invalid request size 0



And this in the trace log:
... ...
10894: open(/usr/lib/libresolv.so.2, O_RDONLY) = 11
10894: fstat(11, 0xFFBE9004) = 0
10894: mmap(0xFE02, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 11, 0) = 0xFE02
10894: mmap(0x, 303104, PROT_READ|PROT_EXEC, MAP_PRIVATE, 11, 0) = 0xFDD2
10894: mmap(0xFDD64000, 15564, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 11, 212992) = 0xFDD64000
10894: mmap(0xFDD68000, 2728, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFDD68000
10894: munmap(0xFDD54000, 65536) = 0
10894: memcntl(0xFDD2, 33536, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
10894: close(11) = 0
10894: munmap(0xFE02, 8192) = 0
10894: so_socket(1, 2, 0, , 1) = 11
10894: fstat64(11, 0xFFBE8A70) = 0
10894: getsockopt(11, 65535, 8192, 0xFFBE8B70, 0xFFBE8B6C, 0) = 0
10894: setsockopt(11, 65535, 8192, 0xFFBE8B70, 4, 0) = 0
10894: fcntl(11, F_SETFL, 0x0080) = 0
10894: connect(11, 0x0060EAA0, 77, 1) = 0
10894: poll(0xFFBE89E8, 1, -1) = 1
10894: sigaction(SIGPIPE, 0xFFBE8788, 0xFFBE) = 0
10894: send(11, \0\001 (\002\0\0 s y n c.., 296, 0) = 296
10894: sigaction(SIGPIPE, 0xFFBE8788, 0xFFBE) = 0
10894: poll(0xFFBE89E8, 1, -1) = 1
10894: recv(11,  R\0\0\0\0 E F A T A L :.., 16384, 0) = 58
... ...


Could you recommend the remedy?


Thanks in advance,
Mike.





[HACKERS] DTrace Probes?

2005-06-16 Thread Josh Berkus
Hey, Folks,

I need to find someone who's really interested in working with DTrace.
Sun has offered to help put DTrace probes into PostgreSQL for advanced
profiling, but needs to know where to probe. Anyone?

I'm afraid that I won't get around to this quickly enough.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Tim Allen

Josh Berkus wrote:

Alvaro,



coffee-with-cream vacuums.


I tried this and now my Hoover makes this horrible noise and smokes.  ;-)


Probably related to the quality of American coffee ;).


All:

Seriously, all: when I said that users were asking for Autovac in the
backend (AVitB), I wasn't talking about just the newbies on #postgresql.
I'm also talking about companies like Hyperic, and whole groups like
postgresql.org.br. This is a feature that people want, and unless there's
something fundamentally unstable about it, it seems really stupid to hold
it back because we're planning VACUUM improvements for 8.2.


AVitB has been on the TODO list for two versions. There have been two
years to question its position there. Now people are bringing up
objections when there's no time left for discussion? This stinks.


Complete agreement from me. Incremental improvements are good - pointing 
out that there are some other incremental improvements that would also 
be good to make is not an argument for delaying the first set of 
incremental improvements.


In our case, we want to be able to install postgres at dozens (ideally 
hundreds... no, thousands :) ) of customer sites, where the customers in 
general are not going to have anyone onsite who has a clue about 
postgres. The existing contrib autovacuum gives a good solution to 
setting things up to maintain the database in a reasonable state of 
health without need for further intervention from us. It's not perfect, 
of course, but if it means the difference between having to unleash our 
support team on a customer once a month and once a year, that's a good 
deal for us. Having it integrated into the backend will make it much 
easier for us, we (hopefully...) won't have to fiddle with extra startup 
scripts, and we'll have one fewer point of failure (eg some customer 
might accidentally turn off the separate pg_autovacuum daemon). Being 
able to customise the autovacuum parameters on a per-table basis is also 
attractive.


Just my AUD0.02. I realise that keeping _our_ customers happy is not 
necessarily anyone else's priority. I'd like to be able to invest some 
coding time, but can't. I haven't even gotten around to completing 
Gavin's survey form (sorry Gav, I'll get to it soon, I hope! :)), so I 
can't demand to be listened to.


But for what it's worth, Alvaro, please keep going, don't be dissuaded.

Tim

--
---
Tim Allen  [EMAIL PROTECTED]
Proximity Pty Ltd  http://www.proximity.com.au/



Re: [HACKERS] [PATCHES] Escape handling in strings

2005-06-16 Thread Bruce Momjian
Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  All true. Conversely, there does need to be a path for us to get to 
  standard behaviour.
 
 Yes --- but the important word there is path.  I think we have to do
 this in stages over a number of releases, to give people time to
 migrate.
 
 Assuming that the end result we want to get to is:
   1. Plain '...' literals are per SQL spec: '' for embedded
  quotes, backslashes are not special.
   2. We add a construct E'...' that handles backslash escapes
  the same way '...' literals do today.
 
 I think what would be reasonable for 8.1 is to create the E'...'
 construct --- which will not cause any backwards compatibility issues
 that I can see --- document it and encourage people to migrate,
 and start throwing warnings about use of \' in non-E literals.
 (We could have a GUC variable to suppress the warnings; I'm of
 the opinion that it would be better not to, though, because the point
 is to get people out of that habit sooner rather than later.)
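To make the two end-state rules concrete, a small illustration (assuming the proposed E'' syntax):

```sql
SELECT 'O''Reilly';    -- SQL-spec quote doubling; works today
SELECT E'O\'Reilly';   -- proposed: backslash escapes honored only here
-- Under the eventual spec-conforming rules, '\n' in a plain literal
-- would be the two characters backslash and n, not a newline.
```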

OK, the current patch warns about two things, \' with one message, and
any backslash in a non-E string with a different message.  The \'
message can easily be avoided in clients even in 8.0 by using '', but
for E'', there is no way to prepare an application before upgrading to
8.1 because 8.0 doesn't have E''.  (We can add E'' in a subrelease, but
what percentage of users are going to upgrade to that?)  This is why I
think we need to add a GUC to allow the warning to be turned off.  To be
clear, the GUC is to control the warning, not the query behavior.

We could go with the second warning only in 8.2, but that seems too
confusing --- we should deal with the escape issue in two stages, rather
than three.

 The hard part in all this is to create apps that will survive the
 transition gracefully.  I think the only way for that is to implement
 a reporting feature that lets the app know whether backslahes are
 special in plain literals or not.  We already have the mechanism for
 that, ie read-only GUC variables with GUC_REPORT enabled (which we use
 for integer datetimes, for instance).  But I really believe it is
 important that this be a *read only* thing not something that can be
 flipped around at runtime.  Anyway, the reporting variable is another
 thing that should appear in 8.1.

OK, adding.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] [PATCHES] Escape handling in strings

2005-06-16 Thread Andrew Dunstan
Bruce Momjian said:

 OK, the current patch warns about two things, \' with one message, and
 any backslash in a non-E string with a different message.  The \'
 message can easily be avoided in clients even in 8.0 by using '', but
 for E'', there is no way to prepare an application before upgrading to
 8.1 because 8.0 doesn't have E''.  (We can add E'' in a subrelease, but
 what percentage of users are going to upgrade to that?)  This is why I
 think we need to add a GUC to allow the warning to be turned off.  To
 be clear, the GUC is to control the warning, not the query behavior.

 We could go with the second warning only in 8.2, but that seems too
 confusing --- we should deal with the escape issue in two stages,
 rather than three.


So you don't agree with Tom's suggestion to implement E'' a full cycle
before removing backslash processing in standard strings? Or have I
misunderstood again?

cheers

andrew





Re: [HACKERS] [PATCHES] Escape handling in strings

2005-06-16 Thread Bruce Momjian
Andrew Dunstan wrote:
 Bruce Momjian said:
 
  OK, the current patch warns about two things, \' with one message, and
  any backslash in a non-E string with a different message.  The \'
  message can easily be avoided in clients even in 8.0 by using '', but
  for E'', there is no way to prepare an application before upgrading to
  8.1 because 8.0 doesn't have E''.  (We can add E'' in a subrelease, but
  what percentage of users are going to upgrade to that?)  This is why I
  think we need to add a GUC to allow the warning to be turned off.  To
  be clear, the GUC is to control the warning, not the query behavior.

  ^
 
  We could go with the second warning only in 8.2, but that seems too
  confusing --- we should deal with the escape issue in two stages,
  rather than three.
 
 
 So you don't agree with Tom's suggestion to implement E'' a full cycle
 before removing backslash processing in standard strings? Or have I
 misunderstood again?

I think you misunderstood.  There is no scheduled date to change the
actual behavior.  The issue is whether we delay one release before
issuing a warning for backslashes in non-E strings.

I have highlighted the sentence where I say we are talking about when to
add the warning, not when to change the behavior.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] writes:

 Yeah --- a libpq-based solution is not what I think of as integrated at
 all, because it cannot do anything that couldn't be done by the existing
 external autovacuum process.  About all you can buy there is having the
 postmaster spawn the autovacuum process, which is slightly more
 convenient to use but doesn't buy any real new functionality.


One reason for not using libpq is that the caller has to wait for the
completion of each vacuum (we don't have async execution in libpq,
right?), whereas signaling does not.

But with signaling, we have to detect whether the forked backend
successfully finished its job. I am not sure how to easily incorporate
that into the current signaling framework.

Regards,
Qingqing





Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Bruce Momjian
Gavin Sherry wrote:
 In January I was in Toronto with Jan, Tom and others, and some ideas about
 vacuum were being discussed. The basic idea is that when we dirty pages we
 set a bit in a bitmap to say that the page has been dirtied. A
 convenient place to do this is when we are writing dirty buffers out to
 disk. In many situations this can happen inside the bgwriter, meaning that
 there should be little contention for this bitmap. Of course, individual
 backends may be writing pages out and would have to account for the
 dirty pages at that point.
 
 Now this bitmap can be kept on a per-heap-segment basis (i.e., per 1 GB heap
 file). You only need 2 pages for the bitmap to represent all the pages in
 the segment, which is fairly nice. When vacuum is run, instead of visiting
 every page, it would see which pages have been dirtied in the bitmap and
 visit only those pages. With large tables and small numbers of modified
 tuples/pages, the effect this change would have would be pretty
 impressive.
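The "2 pages per segment" figure checks out for the default 8 kB block size; a quick back-of-the-envelope calculation:

```sql
-- 1 GB segment / 8 kB pages = 131072 pages -> 131072 bits
-- 131072 bits / 8 = 16384 bytes -> 16384 / 8192 = 2 pages
SELECT 1073741824 / 8192            AS pages_per_segment,  -- 131072
       1073741824 / 8192 / 8        AS bitmap_bytes,       -- 16384
       1073741824 / 8192 / 8 / 8192 AS bitmap_pages;       -- 2
```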

Added to TODO:

* Create a bitmap of pages that need vacuuming

  Instead of sequentially scanning the entire table, have the background
  writer or some other process record pages that have expired rows, then
  VACUUM can look at just those pages rather than the entire table.  In
  the event of a system crash, the bitmap would probably be invalidated.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] [PATCHES] Escape handling in strings

2005-06-16 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 OK, the current patch warns about two things, \' with one message, and
 any backslash in a non-E string with a different message.

Those are two very different things.  \' is easy to get around and
there's no very good reason not to send '' instead.  But avoiding all
use of \anything is impossible (think \\) so a non-suppressable warning
for that would be quite unacceptable IMHO.  I think it's much too early
to be throwing a warning for \anything anyway.  8.2 or so, OK, but not
in this cycle.

regards, tom lane



Re: [HACKERS] [PATCHES] Escape handling in strings

2005-06-16 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  OK, the current patch warns about two things, \' with one message, and
  any backslash in a non-E string with a different message.
 
 Those are two very different things.  \' is easy to get around and
 there's no very good reason not to send '' instead.  But avoiding all
 use of \anything is impossible (think \\) so a non-suppressable warning
 for that would be quite unacceptable IMHO.  I think it's much too early
 to be throwing a warning for \anything anyway.  8.2 or so, OK, but not
 in this cycle.

I am concerned we are going to generate confusion if we warn about one
use of backslashes in strings but not another. I am thinking we will
just add the infrastructure for E'' in 8.1 (with the warning turned
off), state that we will warn about all backslashes in non-E strings in
8.2, and maybe go to spec-conforming literal strings in 8.3 or 8.4,
depending on user feedback.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Autovacuum in the backend

2005-06-16 Thread Thomas F. O'Connell

Gavin,

For the record, I don't consider myself a PostgreSQL newbie, nor do I
manage any 2 TB databases (much less tables), but I do have an unusual
production use case: thousands (> 10,000) of tables, many of them
inherited, and many of them with hundreds of thousands (a few with
millions) of rows.


Honestly, creating crontab vacuum management for this scenario would
be a nightmare, and pg_autovacuum has been a godsend. Considering the
recent revelations of O(n^2) iterations over table lists in the
current versions, and the stated and apparent ease with which this
problem could be solved by integrating the basic functionality of
pg_autovacuum into the backend, I can personally attest to there
being real-world use cases that would benefit tremendously from
integrated autovacuum.


A few months ago, I attempted to solve the wrong problem by
converting a hardcoded threshold into another command-line option. If
I had spotted the O(n^2) problem, I might've spent the time working
on it then instead of on the new command-line option. I suppose it's
possible that I'll head down this road anyway if it looks like
integrated pg_autovacuum is going to be put on hold indefinitely
after this discussion.


Anyway, I just wanted to throw out some food for thought on the
practicality of a tool like pg_autovacuum.


--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jun 16, 2005, at 5:22 PM, Gavin Sherry wrote:


On Thu, 16 Jun 2005, Alvaro Herrera wrote:

On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:

2) By no fault of its own, autovacuum's level of granularity is the
table level. For people dealing with non-trivial amounts of data (and
we're not talking gigabytes or terabytes here), this is a serious
drawback. Vacuum at peak times can cause very intense IO bursts -- even
with the enhancements in 8.0. I don't think the solution to the problem
is to give users the impression that it is solved and then vacuum their
tables during peak periods. I cannot stress this enough.

People running systems with petabyte-sized tables can disable
autovacuum for those tables, and leave it running for the rest. Then
they can schedule whatever maintenance they see fit on their gigantic
tables. Trying to run a database with more than a dozen gigabytes of
data without expert advice (or at least reading the manual) would be
extremely stupid anyway.

As I've said a few times, I'm not concerned about such users. I'm
concerned about users with some busy tables of a few hundred megabytes.
I still don't think VACUUM at arbitrary times on such tables is
suitable.

Thanks,

Gavin




Re: [HACKERS] [PATCHES] Escape handling in strings

2005-06-16 Thread Michael Glaesemann


On Jun 17, 2005, at 12:33 PM, Tom Lane wrote:


Bruce Momjian pgman@candle.pha.pa.us writes:

OK, the current patch warns about two things, \' with one message, and
any backslash in a non-E string with a different message.

Those are two very different things.  \' is easy to get around and
there's no very good reason not to send '' instead.  But avoiding all
use of \anything is impossible (think \\) so a non-suppressable warning
for that would be quite unacceptable IMHO.  I think it's much too early
to be throwing a warning for \anything anyway.  8.2 or so, OK, but not
in this cycle.


I think giving users a longer period of time to make the necessary
changes to their apps is very useful. If (as I understand it) we're
giving them the opportunity to use E'' strings if they want to continue
to use \ for escaping, they can get rid of the warnings now, by using
E'' strings or by using '' to escape. Getting people to migrate from
something like this is difficult and will take them quite a while, I
imagine. Giving them a longer time to change their behavior, as well as
reinforcing it with a warning, is helpful. They can also easily check
whether they've missed any places in their code, because the warnings
will be prominent in their logs.


Michael Glaesemann
grzm myrealbox com


