Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-14 Thread Greg Stark
Dan Harris [EMAIL PROTECTED] writes:

 I keep the entire database vacuumed regularly.

How often is regularly? We get frequent posts from people who think daily or
every 4 hours is often enough. If the table is very busy you can need vacuums
as often as every 15 minutes. 

Also, if you've done occasional massive batch updates like you describe here
you may need a VACUUM FULL or alternatively a CLUSTER command to compact the
table -- vacuum identifies the free space but if you've doubled the size of
your table with a large update that's a lot more free space than you want
hanging around waiting to be used.

 For example, as I'm writing this, I am running an UPDATE statement  that will
 affect a small part of the table, and is querying on an  indexed boolean 
 field.
...
 update eventactivity set ftindex = false where ftindex = true;  ( added the
 where clause because I don't want to alter where ftindex  is null )

It's definitely worthwhile doing an EXPLAIN UPDATE... to see if this even
used the index. It sounds like it did a sequential scan.

Sequential scans during updates are especially painful. If there isn't free
space lying around in the page where the updated record lies then another page
has to be used or a new page added. If you're doing a massive update you can
exhaust the free space available making the update have to go back and forth
between the page being read and the end of the table where pages are being
written.

 #
 
 vmstat output ( as I am waiting for this to finish ):
 procs ---memory-- ---swap-- -io --system--
 cpu
 r  b   swpd   freebuff   cache   si   sobibo   incs  us sy id 
 wa
 0  1   5436 2823908  26140 918370401  2211   540  694   336   9  2 76 
 13

[I assume you ran vmstat 10 or some other interval and then waited for at
least the second line? The first line outputted from vmstat is mostly
meaningless]

Um. That's a pretty meager i/o rate. Just over 2MB/s. The cpu is 76% idle
which sounds fine but that could be one processor pegged at 100% while the
others are idle. If this query is the only one running on the system then it
would behave just like that.

Is it possible you have some foreign keys referencing these records that
you're updating? In which case every record being updated might be causing a
full table scan on another table (or multiple other tables). If those tables
are entirely in cache then it could cause these high cpu low i/o symptoms.

Or are there any triggers on this table?


-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-14 Thread Alvaro Herrera
On Thu, Jul 14, 2005 at 12:28:05AM -0600, Dan Harris wrote:

 Ok, so I remounted this drive as ext2 shortly before sending my first  
 email today.  It wasn't enough time for me to notice the ABSOLUTELY  
 HUGE difference in performance change.  Ext3 must really be crappy  
 for postgres, or at least is on this box.  Now that it's ext2, this  
 thing is flying like never before.   My CPU utilization has  
 skyrocketed, telling me that the disk IO was constraining it immensely.

Were you using the default journal settings for ext3?

An interesting experiment would be to use the other journal options
(particularly data=writeback).  From the mount manpage:

   data=journal / data=ordered / data=writeback
  Specifies  the  journalling  mode  for  file  data.  Metadata is
  always journaled.  To use modes other than ordered on  the  root
  file system, pass the mode to the kernel as boot parameter, e.g.
  rootflags=data=journal.

  journal
 All data is committed into the  journal  prior  to  being
 written into the main file system.

  ordered
 This  is  the  default mode.  All data is forced directly
 out to the main file system prior to its  metadata  being
 committed to the journal.

  writeback
 Data ordering is not preserved - data may be written into
 the main file system after its metadata has been  commit-
 ted  to the journal.  This is rumoured to be the highest-
 throughput option.  It guarantees  internal  file  system
 integrity,  however  it  can  allow old data to appear in
 files after a crash and journal recovery.


-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, Gee, Officer Krupke)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-14 Thread Dan Harris


On Jul 14, 2005, at 9:47 AM, Alvaro Herrera wrote:


On Thu, Jul 14, 2005 at 12:28:05AM -0600, Dan Harris wrote:


.  Ext3 must really be crappy
for postgres, or at least is on this box.


Were you using the default journal settings for ext3?


Yes, I was.  Next time I get a chance to reboot this box, I will try  
writeback and compare the benchmarks to my previous config.  Thanks  
for the tip.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-14 Thread Greg Stark
Dan Harris [EMAIL PROTECTED] writes:

 Well, once every day, but there aren't a ton of inserts or updates  going on a
 daily basis.  Maybe 1,000 total inserts?

It's actually deletes and updates that matter. not inserts.

 I have a feeling I'm going to need to do a cluster soon.  I have done  several
 mass deletes and reloads on it.

CLUSTER effectively does a VACUUM FULL but takes a different approach and
writes out a whole new table, which if there's lots of free space is faster
than moving records around to compact the table.

 I tried that, and indeed it was using an index, although after  reading 
 Simon's
 post, I realize that was kind of dumb to have an  index on a bool. I have 
 since
 removed it.

If there are very few records (like well under 10%) with that column equal to
false (or very few equal to true) then it's not necessarily useless. But
probably more useful is a partial index on some other column.

Something like 

CREATE INDEX ON pk WHERE flag = false;

 No foreign keys or triggers.

Note that I'm talking about foreign keys in *other* tables that refer to
columns in this table. Every update on this table would have to scan those
other tables looking for records referencing the updated rows.


 Ok, so I remounted this drive as ext2 shortly before sending my first  email
 today.  It wasn't enough time for me to notice the ABSOLUTELY  HUGE difference
 in performance change.  Ext3 must really be crappy  for postgres, or at least
 is on this box.  Now that it's ext2, this  thing is flying like never before.
 My CPU utilization has  skyrocketed, telling me that the disk IO was
 constraining it immensely.
 
 I always knew that it might be a little faster, but the box feels  like it can
 breathe again and things that used to be IO intensive  and run for an hour 
 or
 more are now running in  5 minutes.  I'm a  little worried about not having a
 journalized file system, but that  performance difference will keep me from
 switching back ( at least to  ext3! ).  Maybe someday I will try XFS.

@spock(Fascinating).

I wonder if ext3 might be issuing IDE cache flushes on every fsync (to sync
the journal) whereas ext2 might not be issuing any cache flushes at all.

If the IDE cache is never being flushed then you'll see much better
performance but run the risk of data loss in a power failure or hardware
failure. (But not in the case of an OS crash, or at least no more than
otherwise.)

You could also try using the -O journal_dev option to put the ext3 journal
on a separate device.

-- 
greg


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Dan Harris

Gurus,

A table in one of my databases has just crossed the 30 million row  
mark and has begun to feel very sluggish for just about anything I do  
with it.  I keep the entire database vacuumed regularly.  And, as  
long as I'm not doing a sequential scan, things seem reasonably quick  
most of the time.  I'm now thinking that my problem is IO because  
anything that involves heavy ( like a seq scan ) IO seems to slow to  
a crawl.  Even if I am using indexed fields to grab a few thousand  
rows, then going to sequential scans it gets very very slow.


I have also had the occurrence where queries will not finish for days  
( I eventually have to kill them ).  I was hoping to provide an  
explain analyze for them, but if they never finish... even the  
explain never finishes when I try that.


For example, as I'm writing this, I am running an UPDATE statement  
that will affect a small part of the table, and is querying on an  
indexed boolean field.


I have been waiting for over an hour and a half as I write this and  
it still hasn't finished.  I'm thinking I bet Tom, Simon or Josh  
wouldn't put up with this kind of wait time.., so I thought I would  
see if anyone here had some pointers.  Maybe I have a really stupid  
setting in my conf file that is causing this.  I really can't believe  
I am at the limits of this hardware, however.



The query:
update eventactivity set ftindex = false where ftindex = true;  
( added the where clause because I don't want to alter where ftindex  
is null )




The table:
  Column|Type | Modifiers
-+-+---
entrydate   | timestamp without time zone |
incidentid  | character varying(40)   |
statustype  | character varying(20)   |
unitid  | character varying(20)   |
recordtext  | character varying(255)  |
recordtext2 | character varying(255)  |
insertdate  | timestamp without time zone |
ftindex | boolean |
Indexes: eventactivity1 btree (incidentid),
 eventactivity_entrydate_idx btree (entrydate),
 eventactivity_ftindex_idx btree (ftindex),
 eventactivity_oid_idx btree (oid)




The hardware:

4 x 2.2GHz Opterons
12 GB of RAM
4x10k 73GB Ultra320 SCSI drives in RAID 0+1
1GB hardware cache memory on the RAID controller

The OS:
Fedora, kernel 2.6.6-1.435.2.3smp ( redhat stock kernel )
filesystem is mounted as ext2

#

vmstat output ( as I am waiting for this to finish ):
procs ---memory-- ---swap-- -io --system--  
cpu
r  b   swpd   free   buff  cache   si   sobibo   incs us  
sy id wa
0  1   5436 2823908  26140 918370401  2211   540  694   336   
9  2 76 13


#

iostat output ( as I am waiting for this to finish ):
avg-cpu:  %user   %nice%sys %iowait   %idle
   9.190.002.19   13.08   75.53

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
cciss/c0d0  329.26 17686.03  4317.57  161788630   39496378


#
This is a dedicated postgresql server, so maybe some of these  
settings are more liberal than they should be?


relevant ( I hope ) postgresql.conf options are:

shared_buffers = 5
effective_cache_size = 1348000
random_page_cost = 3
work_mem = 512000
max_fsm_pages = 8
log_min_duration_statement = 6
fsync = true ( not sure if I'm daring enough to run without this )
wal_buffers = 1000
checkpoint_segments = 64
checkpoint_timeout = 3000


# FOR PG_AUTOVACUUM --#
stats_command_string = true
stats_row_level = true

Thanks in advance,
Dan








---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Dan Harris


On Jul 13, 2005, at 1:11 PM, John A Meinel wrote:



I might be wrong, but there may be something much more substantially
wrong than slow i/o.
John



Yes, I'm afraid of that too.  I just don't know what tools I should  
use to figure that out.  I have some 20 other databases on this  
system, same schema but varying sizes, and the small ones perform  
very well.  It feels like there is an O(n) increase in wait time that  
has recently become very noticeable on the largest of them.


-Dan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Stephen Frost
* Dan Harris ([EMAIL PROTECTED]) wrote:
 On Jul 13, 2005, at 1:11 PM, John A Meinel wrote:
 I might be wrong, but there may be something much more substantially
 wrong than slow i/o.
 
 Yes, I'm afraid of that too.  I just don't know what tools I should  
 use to figure that out.  I have some 20 other databases on this  
 system, same schema but varying sizes, and the small ones perform  
 very well.  It feels like there is an O(n) increase in wait time that  
 has recently become very noticeable on the largest of them.

Could you come up w/ a test case that others could reproduce where
explain isn't returning?  I think that would be very useful towards
solving at least that issue...

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Alvaro Herrera
On Wed, Jul 13, 2005 at 01:16:25PM -0600, Dan Harris wrote:

 On Jul 13, 2005, at 1:11 PM, John A Meinel wrote:
 
 I might be wrong, but there may be something much more substantially
 wrong than slow i/o.
 
 Yes, I'm afraid of that too.  I just don't know what tools I should  
 use to figure that out.  I have some 20 other databases on this  
 system, same schema but varying sizes, and the small ones perform  
 very well.  It feels like there is an O(n) increase in wait time that  
 has recently become very noticeable on the largest of them.

I'd guess it's stuck on some lock.  Try that EXPLAIN, and when it
blocks, watch the pg_locks view for locks not granted to the process
executing the EXPLAIN.  Then check what else is holding the locks.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
La rebeldía es la virtud original del hombre (Arthur Schopenhauer)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Dan Harris


On Jul 13, 2005, at 2:17 PM, Stephen Frost wrote:


Could you come up w/ a test case that others could reproduce where
explain isn't returning?


This was simply due to my n00bness :)  I had always been doing  
explain analyze, instead of just explain.  Next time one of these  
queries comes up, I will be sure to do the explain without analyze.


FYI that update query I mentioned in the initial thread just finished  
after updating 8.3 million rows.


-Dan


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Vivek Khera


On Jul 13, 2005, at 2:54 PM, Dan Harris wrote:


4 x 2.2GHz Opterons
12 GB of RAM
4x10k 73GB Ultra320 SCSI drives in RAID 0+1
1GB hardware cache memory on the RAID controller



if it is taking that long to update about 25% of your table, then you  
must be I/O bound. check I/o while you're running a big query.


also, what RAID controller are you running?  be sure you have the  
latest BIOS and drivers for it.


on a pair of dual opterons, I can do large operations on tables with  
100 million rows much faster than you seem to be able.  I have  
MegaRAID 320-2x controllers with 15kRPM drives.


Vivek Khera, Ph.D.
+1-301-869-4449 x806




smime.p7s
Description: S/MIME cryptographic signature


Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Simon Riggs
On Wed, 2005-07-13 at 12:54 -0600, Dan Harris wrote:
 For example, as I'm writing this, I am running an UPDATE statement  
 that will affect a small part of the table, and is querying on an  
 indexed boolean field.

An indexed boolean field?

Hopefully, ftindex is false for very few rows of the table?

Try changing the ftindex to be a partial index, so only index the false
values. Or don't index it at all.

Split the table up into smaller pieces.

Don't use an UPDATE statement. Keep a second table, and insert records
into it when you would have updated previously. If a row is not found,
you know that it has ftindex=true. That way, you'll never have row
versions building up in the main table, which you'll still get even if
you VACUUM.

Best Regards, Simon Riggs




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly