Re: [HACKERS] will PITR in 8.0 be usable for hot spare/log shipping type

2004-08-11 Thread Brian Hirt
I wonder if there will be assumptions in the startup code concerning
time.  If the startup takes 18 months, do you think there would be some
sort of problem with this approach?

On Aug 11, 2004, at 6:14 PM, Gaetano Mendola wrote:
Tom Lane wrote:

Somebody should hack this together and try it during beta.  I don't
have time myself.
Will see; if I have spare time I will try.
Regards
Gaetano Mendola



[HACKERS] WARNING: buffer refcount leak

2004-07-26 Thread Brian Hirt
I'm working on a new machine, and I think it has possibly bad
hardware, since that seems more likely than a bug in PostgreSQL.  I'm
wondering if someone has any idea what kind of hardware failure might
cause this message:

WARNING:  buffer refcount leak: [424] (freeNext=425, freePrev=423, 
rel=0/0, blockNum=4294967295, flags=0x1c, refcount=-631 30464)
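For reference, blockNum=4294967295 is 0xFFFFFFFF, i.e. PostgreSQL's
InvalidBlockNumber sentinel, and a refcount of -631 isn't a value the
buffer manager should ever produce, so the descriptor really does look
clobbered.  A tiny standalone check, assuming the stock definition from
src/include/storage/block.h:

#include <stdio.h>
#include <stdint.h>

/* mirrors PostgreSQL's definition in src/include/storage/block.h */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

int
main(void)
{
    BlockNumber blockNum = 4294967295U;   /* value from the WARNING above */

    printf("blockNum == InvalidBlockNumber? %s\n",
           blockNum == InvalidBlockNumber ? "yes" : "no");
    return 0;
}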

The one time this happened, postmaster displayed the refcount leak;
other times it would segfault or crash with messages like (free():
invalid pointer 0xa06ffc0!).  Usually it just works fine; this appears
to be a very intermittent problem.  We've already replaced the SCA
backplane, the SCSI cables, the RAID controller, and the motherboard.
The only components not replaced are the memory and the CPUs.

I've run Memtest86 on the box for several days without it finding any
bad memory.  It's the first test I run on any new machine.  Can anyone
recommend any good (free) diagnostic programs like Memtest86 that check
CPUs, the PCI bus, and so on?

The machine is a dual Xeon 2.8 with 4GB of ECC RAM and fourteen 15k
36GB U320 drives on a MegaRAID 320-2x controller, running Fedora Core 1
and PostgreSQL 7.3.4.

Thanks for any advice, and I hope this isn't too off-topic.
--brian


Re: [HACKERS] pg_autovacuum seems to be a neat freak and cleans way too much

2004-05-18 Thread Brian Hirt
I'm following up on my own email and cross-posting to hackers because
there is a bug that needs to be fixed.  I spent some more time digging
into this, and I found the cause of the problem.

reltuples in pg_class is defined as a real, while reltuples in
pg_autovacuum is defined as an int.  The query used to get reltuples
returns scientific notation for my large tables, '4.06927e+06' for the
one I mention below.  pg_autovacuum happily converts that to 4 by doing
atoi('4.06927e+06'), which is why it's all fubar for my large tables
with over a million tuples.
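Just to illustrate the truncation outside of pg_autovacuum (a minimal
standalone sketch, not code from the contrib module):

#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    const char *reltuples = "4.06927e+06";  /* what the stats query hands back */

    int    as_int    = atoi(reltuples);         /* stops at the '.', yields 4 */
    double as_double = strtod(reltuples, NULL); /* parses the whole value     */

    printf("atoi:   %d\n", as_int);       /* prints 4       */
    printf("strtod: %.0f\n", as_double);  /* prints 4069270 */
    return 0;
}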

My real quick hack of changing the defines in pg_autovacuum.h to cast
reltuples to ::int4 makes it work:

line 37:
#define TABLE_STATS_QUERY "select a.oid,a.relname,a.relnamespace,a.relpages,a.relisshared,a.reltuples::int4,b.schemaname,b.n_tup_ins,b.n_tup_upd,b.n_tup_del from pg_class a, pg_stat_all_tables b where a.oid=b.relid and a.relkind = 'r'"

#define PAGES_QUERY "select oid,reltuples::int4,relpages from pg_class where oid=%i"

However, I think a better fix would be to change pg_autovacuum to use
a double instead of an int.  If it's going to stay an int, it should
probably be increased to a long and the casts changed to ::int8.
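If the double route is preferred, the change would be roughly along
these lines (an untested sketch; the struct and variable names here are
only illustrative, not the actual pg_autovacuum identifiers):

/* in the table-info struct: store the estimate as a double */
typedef struct hypothetical_tbl_info
{
    /* ... other fields ... */
    double      reltuples;   /* was: int reltuples; */
} hypothetical_tbl_info;

/* when reading the query result:
 *   old: tbl->reltuples = atoi(PQgetvalue(res, row, col));          truncates
 *   new: tbl->reltuples = strtod(PQgetvalue(res, row, col), NULL);  keeps the
 *        full estimate
 */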

Any suggestions on the best way to fix this?
I'll supply a patch once the approach is agreed upon and the problem
has been verified.

best regards,
--brian
On May 18, 2004, at 7:37 PM, Brian Hirt wrote:
I'm having a strange issue with pg_autovacuum.  I have a table with
about 4 million rows in 20,000 pages.  autovacuum likes to vacuum
and/or analyze it every 45 minutes or so, but it probably doesn't have
more than a few hundred rows changed every few hours.  When I run
autovacuum with -d3 it says:

[2004-05-18 07:04:26 PM]   table name: basement_nightly.public.search_words4
[2004-05-18 07:04:26 PM]      relid: 396238832;   relisshared: 0
[2004-05-18 07:04:26 PM]      reltuples: 4;  relpages: 20013
[2004-05-18 07:04:26 PM]      curr_analyze_count: 0;  cur_delete_count: 0
[2004-05-18 07:04:26 PM]      ins_at_last_analyze: 0;  del_at_last_vacuum: 0
[2004-05-18 07:04:26 PM]      insert_threshold: 504;  delete_threshold: 1008

reltuples: 4 seems wrong.  I would expect a table with 4m rows and 20k
pages to have more than 4 tuples.  I think this is why the insert
threshold is all messed up -- which is why it gets analyzed way too
frequently.
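Those thresholds also line up with the bad reltuples value: assuming
the stock pg_autovacuum defaults (analyze base 500 / scaling 1, vacuum
base 1000 / scaling 2, i.e. the -a/-A/-v/-V settings), a quick
back-of-the-envelope check gives exactly the 504 and 1008 shown above,
versus millions with the real row count:

#include <stdio.h>

int
main(void)
{
    /* defaults I believe contrib pg_autovacuum uses; adjust for -a/-A/-v/-V */
    double analyze_base = 500, analyze_scale = 1.0;
    double vacuum_base = 1000, vacuum_scale = 2.0;

    double bogus_reltuples = 4;        /* reltuples as pg_autovacuum sees it */
    double real_reltuples  = 4069268;  /* what VACUUM VERBOSE reports        */

    printf("insert_threshold, reltuples=4:       %.0f\n",
           analyze_base + analyze_scale * bogus_reltuples);  /* 504  */
    printf("delete_threshold, reltuples=4:       %.0f\n",
           vacuum_base + vacuum_scale * bogus_reltuples);    /* 1008 */
    printf("insert_threshold, reltuples=4069268: %.0f\n",
           analyze_base + analyze_scale * real_reltuples);   /* 4069768 */
    return 0;
}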

This happens with other big tables too.  The pg_autovacuum is from
7.4.2; some information is below.

output from vacuum:
basement=# vacuum ANALYZE verbose search_words4;
INFO:  vacuuming public.search_words4
INFO:  index search_words4_data_id now contains 4069268 row versions in 15978 pages
DETAIL:  479 index row versions were removed.
1 index pages have been deleted, 0 are currently reusable.
CPU 0.42s/0.70u sec elapsed 29.48 sec.
INFO:  index search_words4_pkey now contains 4069268 row versions in 17576 pages
DETAIL:  479 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.77s/0.74u sec elapsed 150.19 sec.
INFO:  search_words4: removed 479 row versions in 6 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  search_words4: found 479 removable, 4069268 nonremovable row versions in 19950 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 1.30s/1.61u sec elapsed 179.96 sec.
INFO:  analyzing public.search_words4
INFO:  search_words4: 19950 pages, 3000 rows sampled, 4069800 estimated total rows
VACUUM
basement=#


Here's the frequency:
[2004-05-18 12:12:54 PM] Performing: ANALYZE public.search_words4
[2004-05-18 01:59:13 PM] Performing: ANALYZE public.search_words4
[2004-05-18 02:05:36 PM] Performing: ANALYZE public.search_words4
[2004-05-18 02:29:25 PM] Performing: VACUUM ANALYZE public.search_words4
[2004-05-18 02:46:09 PM] Performing: ANALYZE public.search_words4
[2004-05-18 03:39:31 PM] Performing: ANALYZE public.search_words4
[2004-05-18 05:20:45 PM] Performing: ANALYZE public.search_words4
[2004-05-18 06:08:03 PM] Performing: VACUUM ANALYZE public.search_words4
[2004-05-18 06:18:34 PM] Performing: ANALYZE public.search_words4
[2004-05-18 07:34:27 PM] Performing: ANALYZE public.search_words4
[2004-05-18 07:43:18 PM] Performing: ANALYZE public.search_words4



Re: [HACKERS] pg_autovacuum seems to be a neat freak and cleans way too much

2004-05-18 Thread Brian Hirt
There might be another similar bug that was fixed in 7.4.2.  I just
double-checked the 7.4.2 tarball, and it does still have this problem.
You might want to double-check whether it's fixed in 7.4.3, or I can
grab CVS and check it if you like.

On May 18, 2004, at 8:06 PM, Bruce Momjian wrote:
I think we already fixed that in 7.4.2.  We also have a few bugs still
in 7.4.2 and we need to get those fixed soon and release 7.4.3.

[HACKERS] fairly serious bug with pg_autovacuum in pg7.4

2003-11-25 Thread Brian Hirt
Hello,

I've run across a pretty serious problem with pg_autovacuum.
pg_autovacuum loses track of any table that's ever been truncated
(possibly other situations too).  When I truncate a table it gets a
new relfilenode in pg_class.  This is a problem because pg_autovacuum
assumes pg_class.relfilenode will join to pg_stat_all_tables.relid.
pg_stat_all_tables.relid is actually the oid from pg_class, not the
relfilenode.  These two values start out equal, so pg_autovacuum works
initially, but it fails later on because of this incorrect assumption.

Here is one query pg_autovacuum uses (from pg_autovacuum.h) to get
tables; this is what breaks:

select a.relfilenode,a.relname,a.relnamespace,a.relpages,a.relisshared,a.reltuples,b.schemaname,b.n_tup_ins,b.n_tup_upd,b.n_tup_del from pg_class a, pg_stat_all_tables b where a.relfilenode=b.relid and a.relkind = 'r'
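A sketch of one possible fix (untested; all I'm doing is swapping the
join column): select the oid instead and join it against relid, since
pg_stat_all_tables.relid is really the pg_class oid and it survives a
TRUNCATE:

/* sketch of a corrected pg_autovacuum.h define: join
 * pg_stat_all_tables.relid against pg_class.oid instead of relfilenode */
#define TABLE_STATS_QUERY "select a.oid,a.relname,a.relnamespace,a.relpages,a.relisshared,a.reltuples,b.schemaname,b.n_tup_ins,b.n_tup_upd,b.n_tup_del from pg_class a, pg_stat_all_tables b where a.oid=b.relid and a.relkind = 'r'"

The rest of the C code would of course need to track the oid rather
than the relfilenode wherever it uses that column.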

Here's a little test case you can use to see what happens:

basement=# create table test_table ( id int4 );
CREATE TABLE
basement=# select relname, relfilenode from pg_class where relkind = 'r' and relname = 'test_table';
  relname   | relfilenode
+-
 test_table |28814151
(1 row)

basement=# select relid,relname from pg_stat_all_tables where relname = 'test_table';
  relid   |  relname
--+
 28814151 | test_table
(1 row)

basement=# select a.relfilenode,a.relname,a.relnamespace,a.relpages,a.relisshared,a.reltuples,b.schemaname,
basement-# b.n_tup_ins,b.n_tup_upd,b.n_tup_del from pg_class a, pg_stat_all_tables b
basement-# where a.relfilenode=b.relid and a.relkind = 'r' and a.relname = 'test_table';
 relfilenode |  relname   | relnamespace | relpages | relisshared | reltuples | schemaname | n_tup_ins | n_tup_upd | n_tup_del
-------------+------------+--------------+----------+-------------+-----------+------------+-----------+-----------+-----------
    28814151 | test_table |         2200 |       10 | f           |      1000 | public     |         0 |         0 |         0
(1 row)

basement=#
basement=# truncate table test_table;
TRUNCATE TABLE
basement=# select relname, relfilenode from pg_class where relkind = 'r' and relname = 'test_table';
  relname   | relfilenode
+-
 test_table |28814153
(1 row)

basement=# select relid,relname from pg_stat_all_tables where relname = 'test_table';
  relid   |  relname
--+
 28814151 | test_table
(1 row)

basement=# select a.relfilenode,a.relname,a.relnamespace,a.relpages,a.relisshared,a.reltuples,b.schemaname,
basement-# b.n_tup_ins,b.n_tup_upd,b.n_tup_del from pg_class a, pg_stat_all_tables b
basement-# where a.relfilenode=b.relid and a.relkind = 'r' and a.relname = 'test_table';
 relfilenode | relname | relnamespace | relpages | relisshared | reltuples | schemaname | n_tup_ins | n_tup_upd | n_tup_del
-------------+---------+--------------+----------+-------------+-----------+------------+-----------+-----------+-----------
(0 rows)

basement=# drop table test_table;
DROP TABLE
basement=#
PS: I'm running pg 7.4 and pg_autovacuum from contrib.



[HACKERS] Problem with plpgsql functions and foreign key constraints.

2001-05-28 Thread Brian Hirt

I recently ran across this (I think) bug relating to constraints and
functions written in plpgsql.  It seems that I'm getting erroneous
foreign key violations.  I've included two scripts which create the
simplest test case I can reproduce.  One script has a foreign key
defined and the other one doesn't.  Other than that, they are
identical.  From the data in the scripts, it's obvious there aren't
any key violations.

[postgres@loopy postgres]$ diff /tmp/good.sql /tmp/bad.sql
18c18
< create table c2 ( id int, value_sum int);
---
> create table c2 ( id int references c(id), value_sum int);

[postgres@loopy postgres]$ psql test < /tmp/good.sql
NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'c_pkey' for
table 'c'
CREATE
INSERT 19107 1
INSERT 19108 1
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
check(s)
CREATE
INSERT 19126 1
INSERT 19127 1
INSERT 19128 1
INSERT 19129 1
INSERT 19130 1
INSERT 19131 1
CREATE
CREATE
CREATE
UPDATE 6
 id | value_sum
+---
  1 | 6
  2 |17
(2 rows)

 id

  1
  2
(2 rows)

 id

  1
  2
(2 rows)

 id

  1
  2
(2 rows)
[postgres@loopy postgres]$ psql test < /tmp/bad.sql
NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'c_pkey' for
table 'c'
CREATE
INSERT 19164 1
INSERT 19165 1
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
check(s)
CREATE
INSERT 19183 1
INSERT 19184 1
INSERT 19185 1
INSERT 19186 1
INSERT 19187 1
INSERT 19188 1
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
check(s)
CREATE
CREATE
CREATE
ERROR:  triggered data change violation on relation c2
 id | value_sum
+---
(0 rows)

 id

  1
  2
(2 rows)

 id

  1
  2
(2 rows)

 id

(0 rows)



 good.sql
 bad.sql





[HACKERS] Fw: Problem with plpgsql functions and foreign key constraints.

2001-05-23 Thread Brian Hirt

I forgot to mention that this is happening on 7.0.3 and 7.1.1 -- and I'm
running on a RedHat 7.0 machine.

- Original Message -
From: Brian Hirt [EMAIL PROTECTED]
To: Postgres Hackers [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Sunday, June 24, 2001 1:12 AM
Subject: Problem with plpgsql functions and foreign key constraints.





[HACKERS] Question about performance of planner

2000-11-22 Thread Brian Hirt

Hi,

I have a question about the performance of the planner in 7.1.  I've been
testing the 11/21 snapshot of the database just to get an idea of how it
will work for me when I upgrade from 7.0.2.  I've noticed that some queries
are taking much longer, and I've narrowed it down (I think) to the planner.

I've run an identical query against 7.0.2 and 7.1.  Both databases have the
exact same data, and both have been vacuumed.  As you can see below, the
7.1 snapshot is spending 97% of the total time planning the query, whereas
the 7.0.2 version is spending only 27% of the total time planning the query.

If anyone is interested in this, I'll be happy to supply you with information
that would help track this down.


Thanks.

7.1-snapshot
PLANNER STATISTICS
! system usage stats:
!   7.748602 elapsed 5.02 user 0.20 system sec
!   [5.09 user 0.21 sys total]
!   0/0 [0/0] filesystem blocks in/out
!   47/1246 [349/1515] page faults/reclaims, 0 [0] swaps
!   0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!   0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!   Shared blocks: 20 read,  0 written, buffer hit rate = 99.94%
!   Local  blocks:  0 read,  0 written, buffer hit rate = 0.00%
!   Direct blocks:  0 read,  0 written
EXECUTOR STATISTICS
! system usage stats:
!   0.317000 elapsed 0.16 user 0.01 system sec
!   [5.25 user 0.22 sys total]
!   0/0 [0/0] filesystem blocks in/out
!   328/364 [677/1879] page faults/reclaims, 0 [0] swaps
!   0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!   0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!   Shared blocks:160 read,  0 written, buffer hit rate = 97.73%
!   Local  blocks:  0 read,  0 written, buffer hit rate = 0.00%
!   Direct blocks:  0 read,  0 written


7.0.2
! Planner Stats:
! system usage stats:
!   0.051438 elapsed 0.05 user 0.00 system sec
!   [0.33 user 0.05 sys total]
!   0/0 [0/0] filesystem blocks in/out
!   0/51 [680/837] page faults/reclaims, 0 [0] swaps
!   0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!   0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!   Shared blocks:  0 read,  0 written, buffer hit rate = 100.00%
!   Local  blocks:  0 read,  0 written, buffer hit rate = 0.00%
!   Direct blocks:  0 read,  0 written
! Executor Stats:
! system usage stats:
!   0.136506 elapsed 0.13 user 0.00 system sec
!   [0.46 user 0.05 sys total]
!   0/0 [0/0] filesystem blocks in/out
!   0/6 [680/843] page faults/reclaims, 0 [0] swaps
!   0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!   0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!   Shared blocks: 98 read,  0 written, buffer hit rate = 98.98%
!   Local  blocks:  0 read,  0 written, buffer hit rate = 0.00%
!   Direct blocks:  0 read,  0 written


-- 
The world's most ambitious and comprehensive PC game database project.

  http://www.mobygames.com