Re: [HACKERS] autovacuum does not start in HEAD

2007-05-07 Thread Alvaro Herrera
ITAGAKI Takahiro wrote:
> Alvaro Herrera [EMAIL PROTECTED] wrote:
> 
> > ITAGAKI Takahiro wrote:
> > > > I found that autovacuum launcher does not launch any workers in HEAD.
> > > 
> > > The attached autovacuum-fix.patch could fix the problem. I changed
> > > to use 'greater or equal' instead of 'greater' at the decision of
> > > next autovacuum target.
> > 
> > I have committed a patch which might fix this issue in autovacuum.c rev 1.44.
> > Please retest.
> 
> HEAD (r1.45) is still broken. We skip entries using the test
>   adl_next_worker - autovacuum_naptime < current_time <= adl_next_worker,
> but the second inequation should be
>   adl_next_worker - autovacuum_naptime < current_time < adl_next_worker,
> because adl_next_worker can equal current_time.

Ok, I'll change this.

> By the way, why do we need the upper bound to decide the next target?
> Can we simplify it to current_time < adl_next_worker?

No, we can't take that check out, because otherwise a database could be
skipped forever if it happens to fall behind for some reason (for
example when a new database is created and autovac decides to work on
that one instead of the one that was scheduled).

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] autovacuum does not start in HEAD

2007-05-06 Thread ITAGAKI Takahiro
Alvaro Herrera [EMAIL PROTECTED] wrote:

> ITAGAKI Takahiro wrote:
> > > I found that autovacuum launcher does not launch any workers in HEAD.
> > 
> > The attached autovacuum-fix.patch could fix the problem. I changed
> > to use 'greater or equal' instead of 'greater' at the decision of
> > next autovacuum target.
> 
> I have committed a patch which might fix this issue in autovacuum.c rev 1.44.
> Please retest.

HEAD (r1.45) is still broken. We skip entries using the test
  adl_next_worker - autovacuum_naptime < current_time <= adl_next_worker,
but the second inequation should be
  adl_next_worker - autovacuum_naptime < current_time < adl_next_worker,
because adl_next_worker can equal current_time.

@@ -1036,8 +1036,8 @@
 				 * Skip this database if its next_worker value falls between
 				 * the current time and the current time plus naptime.
 				 */
-				if (TimestampDifferenceExceeds(current_time,
-											   dbp->adl_next_worker, 0) &&
+				if (!TimestampDifferenceExceeds(dbp->adl_next_worker,
+												current_time, 0) &&
 					!TimestampDifferenceExceeds(current_time,
 												dbp->adl_next_worker,
 												autovacuum_naptime * 1000))
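
For readers following the arithmetic, here is a minimal sketch of the two skip tests discussed in this message. It is not the autovacuum.c code: `ts_usec`, `diff_exceeds`, `skip_r145` and `skip_proposed` are hypothetical names, timestamps are modelled as plain 64-bit microsecond counts, and `diff_exceeds` is assumed to behave like TimestampDifferenceExceeds (true when stop minus start is at least the given number of milliseconds).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for TimestampTz: microseconds as a 64-bit integer (assumption). */
typedef int64_t ts_usec;

/*
 * Simplified model of TimestampDifferenceExceeds(): true when
 * stop - start is at least msec milliseconds.
 */
static bool
diff_exceeds(ts_usec start, ts_usec stop, int msec)
{
	return (stop - start) >= (int64_t) msec * 1000;
}

/*
 * Test as described for r1.45: skip when
 *   next_worker - naptime < current_time <= next_worker
 */
static bool
skip_r145(ts_usec current_time, ts_usec next_worker, int naptime_sec)
{
	assert(naptime_sec >= 0);
	return diff_exceeds(current_time, next_worker, 0) &&
		!diff_exceeds(current_time, next_worker, naptime_sec * 1000);
}

/*
 * Proposed test: skip when
 *   next_worker - naptime < current_time < next_worker
 */
static bool
skip_proposed(ts_usec current_time, ts_usec next_worker, int naptime_sec)
{
	assert(naptime_sec >= 0);
	return !diff_exceeds(next_worker, current_time, 0) &&
		!diff_exceeds(current_time, next_worker, naptime_sec * 1000);
}
```

The two functions differ only when current_time equals next_worker: the first skips the database in that case, the second does not.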

By the way, why do we need the upper bound to decide the next target?
Can we simplify it to current_time < adl_next_worker?

@@ -1033,16 +1033,11 @@
 			if (dbp->adl_datid == tmp->adw_datid)
 			{
 				/*
-				 * Skip this database if its next_worker value falls between
-				 * the current time and the current time plus naptime.
+				 * Skip this database if its next_worker value is later than
+				 * the current time.
 				 */
-				if (TimestampDifferenceExceeds(current_time,
-											   dbp->adl_next_worker, 0) &&
-					!TimestampDifferenceExceeds(current_time,
-												dbp->adl_next_worker,
-												autovacuum_naptime * 1000))
-					skipit = true;
-
+				skipit = !TimestampDifferenceExceeds(dbp->adl_next_worker,
+													 current_time, 0);
 				break;
 			}
 			elem = DLGetPred(elem);

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] autovacuum does not start in HEAD

2007-05-02 Thread Alvaro Herrera
ITAGAKI Takahiro wrote:
> I wrote:
> > I found that autovacuum launcher does not launch any workers in HEAD.
> 
> The attached autovacuum-fix.patch could fix the problem. I changed
> to use 'greater or equal' instead of 'greater' at the decision of
> next autovacuum target.

I have committed a patch which might fix this issue in autovacuum.c rev 1.44.
Please retest.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] autovacuum does not start in HEAD

2007-05-01 Thread Alvaro Herrera
ITAGAKI Takahiro wrote:
> I wrote:
> > I found that autovacuum launcher does not launch any workers in HEAD.
> 
> The attached autovacuum-fix.patch could fix the problem. I changed
> to use 'greater or equal' instead of 'greater' at the decision of
> next autovacuum target.

I developed a different fix, which is possible due to the addition of
TimestampDifferenceExceeds to the TimestampTz API.  (Thanks Tom).

It continues to work for me here, but please confirm that it fixes the
bug you reported -- I don't have a low-resolution platform handy.
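
As a rough illustration of why the boolean call is easier to get right, the sketch below compares the two styles of "is this database due?" check. It is a simplified model, not the timestamp.c code: `ts_usec`, `ts_difference`, `diff_exceeds`, `due_old` and `due_new` are hypothetical names, and `ts_difference` is assumed to clamp a negative difference to zero, as TimestampDifference is described to do elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for TimestampTz: microseconds as a 64-bit integer (assumption). */
typedef int64_t ts_usec;

#define USECS_PER_SEC 1000000

/*
 * Simplified model of TimestampDifference(): reports stop - start as
 * seconds plus microseconds, clamping a negative difference to zero.
 */
static void
ts_difference(ts_usec start, ts_usec stop, long *secs, int *usecs)
{
	int64_t		diff = stop - start;

	assert(secs != NULL && usecs != NULL);
	if (diff <= 0)
	{
		*secs = 0;
		*usecs = 0;
	}
	else
	{
		*secs = (long) (diff / USECS_PER_SEC);
		*usecs = (int) (diff % USECS_PER_SEC);
	}
}

/* Simplified model of TimestampDifferenceExceeds(). */
static bool
diff_exceeds(ts_usec start, ts_usec stop, int msec)
{
	return (stop - start) >= (int64_t) msec * 1000;
}

/* Old style: two output variables, then a test for "nothing left to wait". */
static bool
due_old(ts_usec current_time, ts_usec next_worker)
{
	long		secs;
	int			usecs;

	ts_difference(current_time, next_worker, &secs, &usecs);
	return secs <= 0 && usecs <= 0;
}

/* New style: one call, true when next_worker is now or in the past. */
static bool
due_new(ts_usec current_time, ts_usec next_worker)
{
	return diff_exceeds(next_worker, current_time, 0);
}
```

In this model both checks agree for past, equal and future values of next_worker; the new style simply needs no temporaries and no reasoning about clamped results.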

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Index: src/backend/postmaster/autovacuum.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.42
diff -c -p -r1.42 autovacuum.c
*** src/backend/postmaster/autovacuum.c	18 Apr 2007 16:44:18 -	1.42
--- src/backend/postmaster/autovacuum.c	2 May 2007 01:53:02 -
*************** AutoVacLauncherMain(int argc, char *argv
*** 549,556 ****
  
  		if (can_launch && AutoVacuumShmem->av_startingWorker != INVALID_OFFSET)
  		{
- 			long	secs;
- 			int		usecs;
  			WorkerInfo worker = (WorkerInfo) MAKE_PTR(AutoVacuumShmem->av_startingWorker);
  
  			if (current_time == 0)
--- 549,554 ----
*************** AutoVacLauncherMain(int argc, char *argv
*** 566,576 ****
  			 * startingWorker pointer before trying to connect; only low-level
  			 * problems, like fork() failure, can get us here.
  			 */
! 			TimestampDifference(worker->wi_launchtime, current_time,
! 								&secs, &usecs);
! 
! 			/* ignore microseconds, as they cannot make any difference */
! 			if (secs > autovacuum_naptime)
  			{
  				LWLockRelease(AutovacuumLock);
  				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
--- 564,571 ----
  			 * startingWorker pointer before trying to connect; only low-level
  			 * problems, like fork() failure, can get us here.
  			 */
! 			if (TimestampDifferenceExceeds(worker->wi_launchtime, current_time,
! 										   autovacuum_naptime * 1000))
  			{
  				LWLockRelease(AutovacuumLock);
  				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
*************** AutoVacLauncherMain(int argc, char *argv
*** 618,630 ****
  			if (elem != NULL)
  			{
  				avl_dbase *avdb = DLE_VAL(elem);
- 				long	secs;
- 				int		usecs;
- 
- 				TimestampDifference(current_time, avdb->adl_next_worker, &secs, &usecs);
  
! 				/* do we have to start a worker? */
! 				if (secs <= 0 && usecs <= 0)
  					launch_worker(current_time);
  			}
  			else
--- 613,625 ----
  			if (elem != NULL)
  			{
  				avl_dbase *avdb = DLE_VAL(elem);
  
! 				/*
! 				 * launch a worker if next_worker is right now or it is in the
! 				 * past
! 				 */
! 				if (TimestampDifferenceExceeds(avdb->adl_next_worker,
! 											   current_time, 0))
  					launch_worker(current_time);
  			}
  			else
*************** do_start_worker(void)
*** 1037,1058 ****
  
  			if (dbp->adl_datid == tmp->adw_datid)
  			{
- 				TimestampTz		curr_plus_naptime;
- 				TimestampTz		next = dbp->adl_next_worker;
- 
- 				curr_plus_naptime =
- 					TimestampTzPlusMilliseconds(current_time,
- 												autovacuum_naptime * 1000);
- 
  				/*
! 				 * What we want here if to skip if next_worker falls between
  				 * the current time and the current time plus naptime.
  				 */
! 				if (timestamp_cmp_internal(current_time, next) > 0)
! 					skipit = false;
! 				else if (timestamp_cmp_internal(next, curr_plus_naptime) > 0)
! 					skipit = false;
! 				else
  					skipit = true;
  
  				break;
--- 1032,1046 ----
  
  			if (dbp->adl_datid == tmp->adw_datid)
  			{
  				/*
! 				 * Skip this database if its next_worker value falls between
  				 * the current time and the current time plus naptime.
  				 */
! 				if (TimestampDifferenceExceeds(current_time,
! 											   dbp->adl_next_worker, 0) &&
! 					!TimestampDifferenceExceeds(current_time,
! 												dbp->adl_next_worker,
! 												autovacuum_naptime * 1000))
  					skipit = true;
  
  				break;



Re: [PATCHES] [HACKERS] autovacuum does not start in HEAD

2007-04-26 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


ITAGAKI Takahiro wrote:
> I wrote:
> > I found that autovacuum launcher does not launch any workers in HEAD.
> 
> The attached autovacuum-fix.patch could fix the problem. I changed
> to use 'greater or equal' instead of 'greater' at the decision of
> next autovacuum target.
> 
> The problem was the timer resolution: on some platforms the timer has
> only millisecond resolution. We initialize adl_next_worker with
> current_time in rebuild_database_list(), and do_start_worker() can then
> see the very same value again, because no measurable time has passed
> on those low-resolution platforms.
> 
> 
> Another attached patch, autovacuum-debug.patch, is just for printf debugging.
> I got the following logs without the fix -- autovacuum never runs.
> 
> # SELECT oid, datname FROM pg_database ORDER BY oid;
>   oid  |  datname
> -------+-----------
>      1 | template1
>  11494 | template0
>  11495 | postgres
>  16384 | bench
> (4 rows)
> 
> # pgbench bench -s1 -c1 -t10
> [with configurations of autovacuum_naptime = 10s and log_min_messages = debug1]
> 
> LOG:  do_start_worker skip : 230863399.25, 230863399.25, 230863409.25
> LOG:  rebuild_database_list: db=11495, time=230863404.25
> LOG:  rebuild_database_list: db=16384, time=230863409.25
> DEBUG:  autovacuum: processing database bench
> LOG:  do_start_worker skip : 230863404.25, 230863404.25, 230863414.25
> LOG:  do_start_worker skip : 230863404.25, 230863409.25, 230863414.25
> LOG:  rebuild_database_list: db=11495, time=230863409.25
> LOG:  rebuild_database_list: db=16384, time=230863414.25
> LOG:  do_start_worker skip : 230863409.25, 230863409.25, 230863419.25
> LOG:  do_start_worker skip : 230863409.25, 230863414.25, 230863419.25
> LOG:  rebuild_database_list: db=11495, time=230863414.25
> LOG:  rebuild_database_list: db=16384, time=230863419.25
> ...
> (no autovacuum activities forever)
> 
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
 

[ Attachment, skipping... ]

[ Attachment, skipping... ]

 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] autovacuum does not start in HEAD

2007-04-25 Thread ITAGAKI Takahiro
I found that autovacuum launcher does not launch any workers in HEAD.

AFAICS, we track when each database is due to be vacuumed in the following way:

1. In rebuild_database_list(), we initialize avl_dbase->adl_next_worker
   with (current_time + autovacuum_naptime / nDBs).
2. In do_start_worker(), we skip database entries whose adl_next_worker
   is between current_time and current_time + autovacuum_naptime.
3. If there are no jobs in do_start_worker(), we call rebuild_database_list()
   to rebuild the database entries.

The point is that we use the same range (current_time to current_time +
autovacuum_naptime) at 1 and 2. We set adl_next_worker to values in that
range, and drop all of them at 2 because their values are in the range.
And if there is no database to vacuum, we re-initialize the database list at 3,
and then we repeat the cycle.

Or am I missing something?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: [HACKERS] autovacuum does not start in HEAD

2007-04-25 Thread Alvaro Herrera
ITAGAKI Takahiro wrote:
> I found that autovacuum launcher does not launch any workers in HEAD.
> 
> AFAICS, we track when each database is due to be vacuumed in the following
> way:
> 
> 1. In rebuild_database_list(), we initialize avl_dbase->adl_next_worker
>    with (current_time + autovacuum_naptime / nDBs).
> 2. In do_start_worker(), we skip database entries whose adl_next_worker
>    is between current_time and current_time + autovacuum_naptime.
> 3. If there are no jobs in do_start_worker(), we call rebuild_database_list()
>    to rebuild the database entries.
> 
> The point is that we use the same range (current_time to current_time +
> autovacuum_naptime) at 1 and 2. We set adl_next_worker to values in that
> range, and drop all of them at 2 because their values are in the range.
> And if there is no database to vacuum, we re-initialize the database list at 3,
> and then we repeat the cycle.
> 
> Or am I missing something?

Note that rebuild_database_list skips databases that don't have stat
entries.  Maybe that's what is confusing your examination.  When the list
is empty, workers are launched only every naptime seconds; and then it'll
also pick only databases with stat entries.  All other databases will be
skipped until the max_freeze_age is reached.  Right after an initdb or a
WAL replay, all database stats are deleted.

The point of (1) is to spread the starting of workers in the
autovacuum_naptime interval.

The point of (2) is that we don't want to process a database that was
processed too recently (less than autovacuum_naptime seconds ago).  This
is useful in the cases where databases are dropped, so the launcher is
awakened earlier than what the schedule would say if the dropped
database were not in the list.  It is possible that I confused the
arithmetic in there (because TimestampDifference does not return
negative results so there may be strange corner cases), but the last
time I examined it it was correct.

The point of (3) is to cover the case where there were no databases
being previously autovacuumed and that may now need vacuuming (i.e. just
after a database got its stat entry).

The fact that some databases may not have stat entries tends to confuse
the logic, both in rebuild_database_list and do_start_worker.  If it's
not documented enough maybe it needs extra clarification in code
comments.
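
To make point (1) concrete, here is a minimal sketch of spreading worker start times over one naptime interval. The exact arithmetic in rebuild_database_list() is an assumption here; `ts_usec` and `spread_next_worker` are hypothetical names, and the step of naptime / nDBs is taken from the description quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for TimestampTz: microseconds as a 64-bit integer (assumption). */
typedef int64_t ts_usec;

/*
 * Fill next_worker[0 .. n_dbs-1] with start times spaced evenly across
 * one naptime interval after current_time, so that the last database is
 * scheduled exactly naptime seconds from now.
 */
static void
spread_next_worker(ts_usec current_time, int naptime_sec, int n_dbs,
				   ts_usec *next_worker)
{
	int64_t		step;
	int			i;

	assert(n_dbs > 0 && naptime_sec >= 0);

	/* Interval between consecutive databases, in microseconds. */
	step = (int64_t) naptime_sec * 1000000 / n_dbs;

	for (i = 0; i < n_dbs; i++)
		next_worker[i] = current_time + step * (i + 1);
}
```

With a naptime of 10 seconds and two databases that have stat entries, this yields offsets of 5 and 10 seconds, which is the spacing visible in the rebuild_database_list debug output posted elsewhere in this thread.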

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] autovacuum does not start in HEAD

2007-04-25 Thread ITAGAKI Takahiro
I wrote:
> I found that autovacuum launcher does not launch any workers in HEAD.

The attached autovacuum-fix.patch could fix the problem. I changed
to use 'greater or equal' instead of 'greater' at the decision of
next autovacuum target.

The problem was the timer resolution: on some platforms the timer has
only millisecond resolution. We initialize adl_next_worker with
current_time in rebuild_database_list(), and do_start_worker() can then
see the very same value again, because no measurable time has passed
on those low-resolution platforms.
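
A small sketch of the effect, under the assumption that a low-resolution clock simply rounds readings down to a fixed step. `ts_usec`, `quantize`, `is_due_strict` and `is_due_inclusive` are hypothetical helpers for illustration, not functions from autovacuum.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for TimestampTz: microseconds as a 64-bit integer (assumption). */
typedef int64_t ts_usec;

/* Round a microsecond reading down to the clock's resolution. */
static ts_usec
quantize(ts_usec t, int64_t resolution_usec)
{
	assert(resolution_usec > 0 && t >= 0);
	return t - (t % resolution_usec);
}

/* 'greater' comparison: an equal timestamp is treated as not yet due. */
static bool
is_due_strict(ts_usec current_time, ts_usec next_worker)
{
	return current_time > next_worker;
}

/* 'greater or equal' comparison: an equal timestamp counts as due. */
static bool
is_due_inclusive(ts_usec current_time, ts_usec next_worker)
{
	return current_time >= next_worker;
}
```

Two readings taken 500 microseconds apart, for example 1000200 and 1000700, both become 1000000 on a clock with a 1000-microsecond step. The strict comparison then never sees the later reading as later, while the inclusive one still treats the database as due.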


Another attached patch, autovacuum-debug.patch, is just for printf debugging.
I got the following logs without the fix -- autovacuum never runs.

# SELECT oid, datname FROM pg_database ORDER BY oid;
  oid  |  datname
-------+-----------
     1 | template1
 11494 | template0
 11495 | postgres
 16384 | bench
(4 rows)

# pgbench bench -s1 -c1 -t10
[with configurations of autovacuum_naptime = 10s and log_min_messages = debug1]

LOG:  do_start_worker skip : 230863399.25, 230863399.25, 230863409.25
LOG:  rebuild_database_list: db=11495, time=230863404.25
LOG:  rebuild_database_list: db=16384, time=230863409.25
DEBUG:  autovacuum: processing database bench
LOG:  do_start_worker skip : 230863404.25, 230863404.25, 230863414.25
LOG:  do_start_worker skip : 230863404.25, 230863409.25, 230863414.25
LOG:  rebuild_database_list: db=11495, time=230863409.25
LOG:  rebuild_database_list: db=16384, time=230863414.25
LOG:  do_start_worker skip : 230863409.25, 230863409.25, 230863419.25
LOG:  do_start_worker skip : 230863409.25, 230863414.25, 230863419.25
LOG:  rebuild_database_list: db=11495, time=230863414.25
LOG:  rebuild_database_list: db=16384, time=230863419.25
...
(no autovacuum activities forever)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



autovacuum-debug.patch
Description: Binary data


autovacuum-fix.patch
Description: Binary data
