[PERFORM] Contemplating SSD Hardware RAID
I'm looking for advice from the I/O gurus who have been in the SSD game for a while now. I understand that the majority of consumer grade SSD drives lack the required capacitor to complete a write on a sudden power loss. But, what about pairing up with a hardware controller with BBU write cache? Can the write cache be disabled at the drive and result in a safe setup? I'm exploring the combination of an Areca 1880ix-12 controller with 6x OCZ Vertex 3 V3LT-25SAT3 2.5" 240GB SATA III drives in RAID-10. Has anyone tried this combination? What nasty surprise am I overlooking here? Thanks -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
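For anyone testing this sort of setup, a minimal sketch of how the on-drive write cache can be inspected and turned off under Linux; the device name is only a placeholder, and drives sitting behind a RAID controller are usually not visible as /dev/sdX at all, in which case the equivalent toggle typically lives in the controller's own BIOS or CLI (Areca controllers expose a disk write-cache setting in their management interface).

    # report whether the drive's volatile write cache is on (assumes a directly attached SATA disk)
    hdparm -W /dev/sda

    # turn the on-drive write cache off, leaving only the controller's BBU-protected cache volatile
    hdparm -W 0 /dev/sda

With the drive cache disabled, a sudden power loss can only lose data sitting in the battery-backed controller cache, which is exactly what the BBU is there to cover.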
Re: [PERFORM] Linux I/O schedulers - CFQ & random seeks
On 3/4/11 11:03 AM, Wayne Conrad wrote: On 03/04/11 10:34, Glyn Astill wrote: > I'm wondering (and this may be a can of worms) what peoples opinions are on these schedulers? When testing our new DB box just last month, we saw a big improvement in bonnie++ random I/O rates when using the noop scheduler instead of cfq (or any other). We've got RAID 10/12 on a 3ware card w/ battery-backed cache; 7200rpm drives. Our file system is XFS with noatime,nobarrier,logbufs=8,logbsize=256k. How much is "big?" I can't find my notes for it, but I recall that the difference was large enough to surprise us. We're running with noop in production right now. No complaints. Just another anecdote, I found that the deadline scheduler performed the best for me. I don't have the benchmarks anymore but deadline vs cfq was dramatically faster for my tests. I posted this to the list years ago and others announced similar experiences. Noop was a close 2nd to deadline. XFS (noatime,nodiratime,nobarrier,logbufs=8) 391GB db cluster directory BBU Caching RAID10 12-disk SAS 128GB RAM Constant insert stream OLAP-ish query patterns Heavy random I/O -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
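For anyone who wants to repeat these comparisons, a quick sketch of how the scheduler is inspected and switched; the device name is just an example:

    # show the available schedulers for a device; the active one is in brackets
    cat /sys/block/sdb/queue/scheduler

    # switch that device to deadline (or noop) at runtime
    echo deadline > /sys/block/sdb/queue/scheduler

    # or set it globally at boot time on the kernel command line:
    #   elevator=deadline

The runtime switch makes it easy to benchmark cfq, deadline and noop back to back on the same box before committing to a boot-time setting.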
Re: [PERFORM] Slow count(*) again...
On 10/12/10 4:33 PM, Neil Whelchel wrote: On Tuesday 12 October 2010 08:39:19 Dan Harris wrote: On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file fragmentation over time and sequential scans end up mostly random. On-line file defrag helps tremendously. We just had a corrupt table caused by an XFS online defrag. I'm scared to use this again while the db is live. Has anyone else found this to be safe? But, I can vouch for the fragmentation issue, it happens very quickly in our system. -Dan I would like to know the details of what was going on that caused your problem. I have been using XFS for over 9 years, and it has never caused any trouble at all in a production environment. Sure, I had many problems with it on the test bench, but in most cases the issues were very clear and easy to avoid in production. There were some (older) XFS tools that caused some problems, but that is in the past, and as time goes on, it seems to take less and less planning to make it work properly. -Neil- There were roughly 50 transactions/sec going on at the time I ran it. xfs_db reported 99% fragmentation before it ran ( we haven't been running it via cron ). The operation completed in about 15 minutes ( 360GB of used data on the file system ) with no errors. Everything seemed fine until the next morning when a user went to query a table, and we got a message about a "missing" file inside the pg cluster. We were unable to query the table at all via psql. It was a bit of a panic situation, so we restored that table from backup immediately and the problem was solved without doing more research. This database has been running for years with no problem ( and none since ); that was the first time I tried to do an on-line defrag, and that was the only unusual variable introduced into the system at the time, so it was a strong enough correlation for me to believe it caused the corruption. Hopefully this was just a corner case. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
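For reference, this is roughly how the fragmentation report and the online defragmenter are run; the device and mount point here are placeholders:

    # read-only fragmentation report for an XFS filesystem
    xfs_db -c frag -r /dev/sdb1

    # online defragment a mounted XFS filesystem, verbosely
    xfs_fsr -v /var/lib/pgsql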
Re: [PERFORM] Slow count(*) again...
On 10/12/10 10:44 AM, Scott Carey wrote: On Oct 12, 2010, at 8:39 AM, Dan Harris wrote: On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file fragmentation over time and sequential scans end up mostly random. On-line file defrag helps tremendously. We just had a corrupt table caused by an XFS online defrag. I'm scared to use this again while the db is live. Has anyone else found this to be safe? But, I can vouch for the fragmentation issue, it happens very quickly in our system. What version? I'm using the latest CentOS extras build. We've been doing online defrag for a while now on a very busy database with > 8TB of data. Not that that means there are no bugs... It is a relatively simple thing in xfs -- it writes a new file to temp in a way that allocates contiguous space if available, then, if the file has not been modified since it was re-written, it is essentially moved on top of the other one. This should be safe provided the journaling and storage are safe, etc. I'm not sure how to figure out what version of XFS we're on, but it's Linux kernel 2.6.24-24 x86_64 on Ubuntu Server 8.04.3, Postgres version 8.3. We're due for an upgrade on that server soon so we'll do some more testing once we upgrade. Right now we are just living with the fragmentation. I'm glad to hear the regular on-line defrag is working successfully; at least that gives me hope we can rely on it in the future. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Slow count(*) again...
On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file fragmentation over time and sequential scans end up mostly random. On-line file defrag helps tremendously. We just had a corrupt table caused by an XFS online defrag. I'm scared to use this again while the db is live. Has anyone else found this to be safe? But, I can vouch for the fragmentation issue, it happens very quickly in our system. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] large dataset with write vs read clients
On 10/7/10 11:47 AM, Aaron Turner wrote: Basically, each connection is taking about 100MB resident. As we need to increase the number of threads to be able to query all the devices in the 5 minute window, we're running out of memory. I think the first thing to do is look into using a connection pooler like pgpool to reduce your connection memory overhead. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
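As a rough sketch of the pooling idea (the values here are only illustrative, not recommendations), a pgpool-II setup lets the collector keep many client connections while the database itself only sees a small, fixed set of backends:

    # pgpool.conf fragment
    num_init_children = 32        # client connections pgpool will accept
    max_pool = 4                  # cached backend connections per child
    backend_hostname0 = 'dbhost'  # placeholder host
    backend_port0 = 5432

Each PostgreSQL backend keeps its ~100MB footprint, but there are far fewer of them than there are querying threads.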
Re: [PERFORM] Got that new server, now it's time for config!
On 3/22/10 4:36 PM, Carlo Stonebanks wrote: Here we go again! Can anyone see any obvious faults? Carlo maintenance_work_mem = 256MB I'm not sure how large your individual tables are, but you might want to bump this value up to get faster vacuums. max_fsm_relations = 1000 I think this will definitely need to be increased work_mem = 64MB Most data warehousing loads I can think of will need more work_mem, but this depends on how large of data sets you are planning to sort. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
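As a sketch only -- the right numbers depend on table sizes and on how many sessions sort at once -- those suggestions translate into postgresql.conf entries along these lines:

    maintenance_work_mem = 1GB   # speeds up VACUUM and index builds on big tables
    max_fsm_relations = 10000    # should exceed the total number of tables and indexes
    work_mem = 256MB             # per sort/hash, so budget for concurrent queries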
[PERFORM] The state of PG replication in 2008/Q2?
My company finally has the means to install a new database server for replication. I have Googled and found a lot of sparse information out there regarding replication systems for PostgreSQL and a lot of it looks very out-of-date. Can I please get some ideas from those of you that are currently using fail-over replication systems? What advantage does your solution have? What are the "gotchas" I need to worry about? My desire would be to have a parallel server that could act as a hot standby system with automatic fail over in a multi-master role. If our primary server goes down for whatever reason, the secondary would take over and handle the load seamlessly. I think this is really the "holy grail" scenario and I understand how difficult it is to achieve. Especially since we make frequent use of sequences in our databases. If MM is too difficult, I'm willing to accept a hot-standby read-only system that will handle queries until we can fix whatever ails the master. We are primary an OLAP environment but there is a constant stream of inserts into the databases. There are 47 different databases hosted on the primary server and this number will continue to scale up to whatever the server seems to support. The reason I mention this number is that it seems that those systems that make heavy use of schema changes require a lot of "fiddling". For a single database, this doesn't seem too problematic, but any manual work involved and administrative overhead will scale at the same rate as the database count grows and I certainly want to minimize as much fiddling as possible. We are using 8.3 and the total combined size for the PG data directory is 226G. Hopefully I didn't neglect to include more relevant information. As always, thank you for your insight. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] query performance question
Kenneth Marshall wrote: Dan, Did you try this with 8.3 and its new HOT functionality? Ken I did not. I had to come up with the solution before we were able to move to 8.3. But, Tom did mention that the HOT might help and I forgot about that when writing the prior message. I'm in the midst of moving 30 databases from 8.0 to 8.3 at the moment but when I'm finished, I might have time to test it. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] query performance question
[EMAIL PROTECTED] wrote: 3) Build a table with totals or maybe subtotals, updated by triggers. This requires serious changes in application as well as in database, but solves issues of 1) and may give you even better results. Tomas I have tried this. It's not a magic bullet. We do our billing based on counts from huge tables, so accuracy is important to us. I tried implementing such a scheme and ended up abandoning it because the summary table became so full of dead tuples during and after large bulk inserts that it slowed down selects on that table to an unacceptable speed. Even with a VACUUM issued every few hundred inserts, it still bogged down due to the constant churn of the inserts. I ended up moving this count tracking into the application level. It's messy and only allows a single instance of an insert program due to the localization of the counts in program memory, but it was the only way I found to avoid the penalty of constant table churn on the triggered inserts. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planning a new server - help needed
Laszlo Nagy wrote: Question 4. How to make the partitions? This is the hardest question. Here is my plan: - the OS resides on 2 disks, RAID 1 - the databases should go on 8 disks, RAID 0 + 1 Make sure you understand the difference between RAID 1+0 and RAID 0+1.. I suspect you'll end up going with 1+0 instead. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Minimizing dead tuples caused by update triggers
I've been fighting with the common workarounds for inadequate response times on select count(*) and min(),max() on tables with tens of millions of rows for quite a while now and understand the reasons for the table scans. I have applications that regularly poll a table ( ideally, the more frequent, the better ) to learn the most recent data inside it as well as the # of rows in it ( among a few other things ). As the databases have grown in size, these summarizations could no longer be done on the fly, so I wrote a database wrapper API that tracks those values internally. This wrapper has grown very complex and is difficult to manage across different systems. What I'd like to do instead is implement triggers for inserts, updates, and deletes to check and/or replace a value in a "table_stats" table, representing table count, min/max dates, and a few other costly operations, that can then be queried in short order. I know this is a fairly common thing to do. The thing that concerns me is dead tuples on the table_stats table. I believe that every insert of new data in one of the monitored tables will result in an UPDATE of the table_stats table. When thousands ( or millions ) of rows are inserted, the select performance ( even trying with an index ) on table_stats slows down in a hurry. If I wrap the inserts into large transactions, will it only call the update on table_stats when I commit? Obviously I want to vacuum this table regularly to recover this space. The problem I'm running into is contention between VACUUM ( not full ) and pg_dump ( version 8.0.12 ). My system backup takes 6 hours to run pg_dump on a 400GB cluster directory. If the vacuum command fires during the dump, it forces an exclusive lock and any queries will hang until pg_dump finishes. If I have to wait until pg_dump is finished before issuing the VACUUM command, everything slows down significantly as the dead tuples in table_stats pile up. What strategy could I employ to either: 1. resolve the contention between pg_dump and vacuum, or 2. reduce the dead tuple pile-up between vacuums? Thanks for reading -Dan ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
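A minimal sketch of the trigger approach described above, borrowing the eventmain/entrydate names from elsewhere in this thread purely for illustration:

    -- one row per monitored table
    CREATE TABLE table_stats (
        tablename     text PRIMARY KEY,
        row_count     bigint NOT NULL,
        max_entrydate timestamp without time zone
    );

    CREATE OR REPLACE FUNCTION eventmain_stats_trig() RETURNS trigger AS $$
    BEGIN
        UPDATE table_stats
           SET row_count = row_count + 1,
               max_entrydate = CASE WHEN max_entrydate IS NULL OR NEW.entrydate > max_entrydate
                                    THEN NEW.entrydate ELSE max_entrydate END
         WHERE tablename = 'eventmain';
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER eventmain_stats
        AFTER INSERT ON eventmain
        FOR EACH ROW EXECUTE PROCEDURE eventmain_stats_trig();

Note that a row-level trigger like this fires for every inserted row as the statement runs, not once at commit, so wrapping the bulk inserts in one transaction does not reduce the number of UPDATEs on table_stats -- each of which leaves a dead row version behind.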
[PERFORM] pg_dump blocking create database?
My PG server came to a screeching halt yesterday. Looking at top saw a very large number of "startup waiting" tasks. A pg_dump was running and one of my scripts had issued a CREATE DATABASE command. It looks like the CREATE DATABASE was exclusive but was having to wait for the pg_dump to finish, causing a massive traffic jam of locks behind it. Once I killed the pg_dump process, things returned to normal. Version 8.0.12. Is this a bug? It also concerns me because my workload is likely to expose this problem again. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Performance problem with table containing a lot of text (blog)
Kari Lavikka wrote: Hello! Some background info.. We have a blog table that contains about eight million blog entries. Average length of an entry is 1200 letters. Because each 8k page can accommodate only a few entries, every query that involves several entries causes several random seeks to disk. We are having problems with queries like: 1) give me a list of months when I have written someting 2) give me id's of entries I have written on month X year X 3) give me the number of blog entries my friends have written since last time I didn't see your schema, but couldn't these problems be solved by storing the article id, owner id, and blog date in a separate table? It seems that if you don't actually need the content of the blogs, all of those questions could be answered by querying a very simple table with minimal I/O overhead. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
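A sketch of that idea, with invented names: keep a narrow companion table holding only the columns the listing queries touch, so each 8k page packs hundreds of rows instead of a handful of 1200-character entries:

    CREATE TABLE blog_entry_meta (
        entry_id   integer PRIMARY KEY,   -- references the blog table
        owner_id   integer NOT NULL,
        entry_date date    NOT NULL
    );
    CREATE INDEX blog_entry_meta_owner_date ON blog_entry_meta (owner_id, entry_date);

    -- e.g. "months when I have written something"
    SELECT DISTINCT date_trunc('month', entry_date)
      FROM blog_entry_meta
     WHERE owner_id = 42;

Questions 1-3 can all be answered from this table without ever touching the wide rows.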
Re: [PERFORM] Simple query showing 270 hours of CPU time
Tom Lane wrote: Dan Harris <[EMAIL PROTECTED]> writes: Here's the strace summary as run for a few second sample:

 % time     seconds  usecs/call     calls    errors syscall
 ------ ----------- ----------- --------- --------- ----------------
  97.25    0.671629          92      7272           semop
   1.76    0.012171         406        30           recvfrom
   0.57    0.003960          66        60           gettimeofday
   0.36    0.002512          28        90           sendto
   0.05    0.000317          10        32           lseek
   0.01    0.000049           1        48           select
 ------ ----------- ----------- --------- --------- ----------------
 100.00    0.690638                    7532           total

Here's the query: select id from eventkeywords where word = '3322' How sure are you that (a) that's really what it's doing and (b) you are not observing multiple executions of the query? There are no recvfrom calls in the inner loops of the backend AFAIR, so this looks to me like the execution of 30 different queries. The number of semops is distressingly high, but that's a contention issue not an amount-of-runtime issue. You were absolutely right. This is one connection that is doing a whole lot of ( slow ) queries. I jumped the gun on this and assumed it was a single query taking this long. Sorry to waste time and bandwidth. Since you mentioned the number of semops is distressingly high, does this indicate a tuning problem? The machine has 64GB of RAM and as far as I can tell about 63GB is all cache. I wonder if this is a clue to an undervalued memory-related setting somewhere? -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Simple query showing 270 hours of CPU time
Today, I looked at 'top' on my PG server and saw a pid that reported 270 hours of CPU time. Considering this is a very simple query, I was surprised to say the least. I was about to just kill the pid, but I figured I'd try and see exactly what it was stuck doing for so long. Here's the strace summary as run for a few-second sample:

 % time     seconds  usecs/call     calls    errors syscall
 ------ ----------- ----------- --------- --------- ----------------
  97.25    0.671629          92      7272           semop
   1.76    0.012171         406        30           recvfrom
   0.57    0.003960          66        60           gettimeofday
   0.36    0.002512          28        90           sendto
   0.05    0.000317          10        32           lseek
   0.01    0.000049           1        48           select
 ------ ----------- ----------- --------- --------- ----------------
 100.00    0.690638                    7532           total

Here's the query: select id from eventkeywords where word = '3322' If I run the query manually, it completes in about 500ms, which is very reasonable. There are 408563 rows in this table. I just noticed there is no index on word ( there should be! ). Would this have caused the problem? This is 8.0.12 on Linux sunrise 2.6.15-26-amd64-server #1 SMP Fri Sep 8 20:33:15 UTC 2006 x86_64 GNU/Linux. Any idea what might have set it into this loop? -Dan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
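For completeness, a sketch of the missing index (followed by a fresh ANALYZE so the planner knows about it):

    CREATE INDEX eventkeywords_word_idx ON eventkeywords (word);
    ANALYZE eventkeywords;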
Re: [PERFORM] importance of fast disks with pg
Thomas Finneid wrote: Hi During some tests I did, I noticed that it does not necessarily seem to be true that one needs the fastest disks to have a pg system that is fast. It seems to me that it's more important to: - choose the correct methods to use for the operation - tune the pg memory settings - tune/disable pg xlog/wal etc It also seems to me that fast disks are more important for db systems of the OLTP type applications with real concurrency of both readers and writers across many, possibly larger, tables etc. Are the above statements close to having any truth in them? regards thomas I'd say that "it depends". We run an OLAP workload on 350+ gigs of database on a system with 64GB of RAM. I can tell you for certain that fetching non-cached data is very sensitive to disk throughput! Different types of workloads will find different bottlenecks in the system. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] [ADMIN] reclaiming disk space after major updates
Andrew Sullivan wrote: On Thu, Jun 07, 2007 at 03:26:56PM -0600, Dan Harris wrote: They don't always have to be in a single transaction, that's a good idea to break it up and vacuum in between, I'll consider that. Thanks If you can do it this way, it helps _a lot_. I've had to do this sort of thing, and breaking into groups of a couple thousand or so really made the difference. A One more point in my original post.. For my own education, why does VACUUM FULL prevent reads to a table when running (I'm sure there's a good reason)? I can certainly understand blocking writes, but if I could still read from it, I'd have no problems at all! -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Seq Scan
Tyler Durden wrote: Hi, I'm having some problems in performance in a simple select count(id) from I have 700 000 records in one table, and when I do: # explain select (id) from table_name; -[ RECORD 1 ] QUERY PLAN | Seq Scan on table_name (cost=0.00..8601.30 rows=266730 width=4) I had created an index for id(btree), but still shows "Seq Scan". What I'm doing wrong? Thanks, Tyler ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org select count(*) will *always* do a sequential scan, due to the MVCC architecture. See archives for much discussion about this. -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
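When an exact figure isn't required, one common workaround is to read the planner's own row estimate from pg_class, which costs almost nothing; it is only as fresh as the last VACUUM/ANALYZE:

    SELECT reltuples::bigint AS approx_rows
      FROM pg_class
     WHERE relname = 'table_name';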
Re: [PERFORM] Background vacuum
Daniel Haensse wrote: Dear list, I'm running postgres on a tomcat server. The vacuum is run every hour (cronjob) which leads to a performance drop of the tomcat applications. I played around with renice command and I think it is possible to reduce this effect which a renice. The problem is how can I figure out the PID of the postmaster performing the vacuum(automated)? Has anybody a nice solution to change process priority? A shell script, maybe even for java? While this may technically work, I think it lacks a key point. 'nice' ( at least the versions I'm familiar with ) do not adjust I/O priority. VACUUM is bogging things down because of the extra strain on I/O. CPU usage shouldn't really be much of a factor. Instead, I would recommend looking at vacuum_cost_delay and the related settings to make vacuum lower priority than the queries you care about. This should be a cleaner solution for you. -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
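The cost-based delay settings look roughly like this in postgresql.conf; the numbers are only a starting point and need tuning against the real I/O load:

    vacuum_cost_delay = 20     # ms to sleep each time the cost limit is reached (0 = no throttling)
    vacuum_cost_limit = 200    # accumulated page cost that triggers the sleep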
Re: [PERFORM]
Orhan Aglagul wrote: Hi Everybody, I was trying to see how many inserts per seconds my application could handle on various machines. I read that postgres does have issues with MP Xeon (costly context switching). But I still think that with fsync=on 65 seconds is ridiculous. CPU is unlikely your bottleneck.. You failed to mention anything about your I/O setup. More details in this regard will net you better responses. However, an archive search for insert performance will probably be worthwhile, since this type of question is repeated about once a month. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
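One very common cause of numbers like that is committing every insert individually; with fsync=on each commit has to wait for a WAL flush. A quick sanity check, sketched here with a made-up table, is to batch the inserts into one transaction or load with COPY:

    BEGIN;
    INSERT INTO test_table VALUES (1, 'row one');
    -- ... thousands more inserts ...
    COMMIT;   -- one synchronous WAL flush instead of one per row

    -- or bulk-load from a file, which is faster still
    COPY test_table FROM '/tmp/test_data.csv' WITH CSV;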
Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning
Bill Moran wrote: In response to Dan Harris <[EMAIL PROTECTED]>: Why does the user need to manually track max_fsm_pages and max_fsm_relations? I bet there are many users who have never taken the time to understand what this means and wondering why performance still stinks after vacuuming their database ( spoken from my own experience ) But there are two distinct routes that can be taken if there's not enough fsm space: add fsm space or vacuum more frequently. I don't want the system to eat up a bunch of memory for fsm entries if my workload indicates that I can easily vacuum more frequently. There's no magic bullet here, but heuristics should be able to tell us you can "easily vacuum more frequently" And again, I said these things would be *optional*. Like an item in postgresql.conf "i_have_read_the_manual_and_know_what_this_all_means = false #default false". If you change it to true, you have all the control you're used to and nothing will get in your way. How about work_mem? shared_buffers? column statistics sizes? random_page_cost? The only one that seems practical (to me) is random_page_cost. The others are all configuration options that I (as a DBA) want to be able to decide for myself. For example, I have some dedicated PG servers that I pretty much max those values out at, to let PG know that it can use everything on the system -- but I also have some shared use machines with PG, where I carefully constrain those values so that PG doesn't muscle other daemons out of their share of the RAM (work_mem is probably the best example) Just because you carefully constrain it does not preclude the ability for program logic to maintain statistics to do what I suggested. It would be nice to have some kind of utility that could tell me what random_page_cost should be, as I've never felt comfortable tweaking it. Like some utility to run that would say "based on the seek tests I just ran, you should set random_page_cost to x". Of course, if such a thing existed, it could just fill in the value for you. But I haven't figured out how to pick a good value for that setting, so I have no idea how to suggest to have it automatically set. Me either, but I thought if there's a reason it's user-settable, there must be some demonstrable method for deciding what is best. Couldn't some fairly simple regression tests akin to a VACUUM process spot potential problems? "Hey, it looks like you need more fsm_relations.. I bumped that up automatically for you". Or "These indexes look bloated, shall I automatically reindex them for you?" A lot of that stuff does happen. A vacuum verbose will tell you what it thinks you should do, but I don't _want_ it to do it automatically. What if I create huge temporary tables once a week for some sort of analysis that overload the fsm space? And if I'm dropping those tables when the analysis is done, do I want the fsm space constantly adjusting? I understand *you* don't want it done automatically. But my suspicion is that there are a lot more newbie pg admins who would rather let the system do something sensible as a default. Again, you sound defensive that somehow my ideas would take power away from you. I'm not sure why that is, but certainly I'm not suggesting that. An auto-pilot mode is not a bad idea just because a few pilots don't want to use it. Plus, some is just impossible. shared_buffers requires a restart. Do you want your DB server spontaneously restarting because it thought more buffers might be nice? 
Well, maybe look at the bigger picture and see if it can be fixed to *not* require a program restart? Or.. take effect on the next pid that gets created? This is a current limitation, but doesn't need to be one for eternity does it? I'm sure there are many more examples, that with some creative thinking, could be auto-adjusted to match the usage patterns of the database. PG does an excellent job of exposing the variables to the users, but mostly avoids telling the user what to do or doing it for them. Instead, it is up to the user to know where to look, what to look for, and how to react to things to improve performance. This is not all bad, but it is assuming that all users are hackers ( which used to be true ), but certainly doesn't help when the average SQLServer admin tries out Postgres and then is surprised at the things they are now responsible for managing. PG is certainly *not* the only database to suffer from this syndrome, I know.. I expect the suffering is a result of the fact that databases are non-trivial pieces of software, and there's no universally simple way to set them up and make them run well. Speaking as a former SQL Server admin ( from day 1 of the Sybase fork up to version 2000 ), I can say there *is* a way to
Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning
Michael Stone wrote: On Fri, Apr 27, 2007 at 09:27:49AM -0400, Carlos Moreno wrote: Notice that the second part of my suggestion covers this --- have additional switches to initdb If the person knows all that, why wouldn't they know to just change the config parameters? Exactly.. What I think would be much more productive is to use the great amount of information that PG tracks internally and auto-tune the parameters based on it. For instance: Why does the user need to manually track max_fsm_pages and max_fsm_relations? I bet there are many users who have never taken the time to understand what this means and wondering why performance still stinks after vacuuming their database ( spoken from my own experience ) How about work_mem? shared_buffers? column statistics sizes? random_page_cost? Couldn't some fairly simple regression tests akin to a VACUUM process spot potential problems? "Hey, it looks like you need more fsm_relations.. I bumped that up automatically for you". Or "These indexes look bloated, shall I automatically reindex them for you?" I'm sure there are many more examples, that with some creative thinking, could be auto-adjusted to match the usage patterns of the database. PG does an excellent job of exposing the variables to the users, but mostly avoids telling the user what to do or doing it for them. Instead, it is up to the user to know where to look, what to look for, and how to react to things to improve performance. This is not all bad, but it is assuming that all users are hackers ( which used to be true ), but certainly doesn't help when the average SQLServer admin tries out Postgres and then is surprised at the things they are now responsible for managing. PG is certainly *not* the only database to suffer from this syndrome, I know.. I like to think of my systems as good employees. I don't want to have to micromanage everything they do. I want to tell them "here's what I want done", and assuming I made a good hiring choice, they will do it and take some liberty to adjust parameters where needed to achieve the spirit of the goal, rather than blindly do something inefficiently because I failed to explain to them the absolute most efficient way to accomplish the task. Granted, there are some people who don't like the developers making any assumptions about their workload. But this doesn't have to be an either/or proposition. I don't think any control needs to be abandoned. But self-adjusting defaults seem like an achievable goal ( I know, I know, "show us the patch" ). I just don't know if this feeling has resonated well between new users and long-term developers. I know it must be grating to have to answer the same questions over and over and over "have you analyzed? Did you leave postgresql.conf at the defaults??". Seems like a win-win for both sides, IMHO. In closing, I am not bashing PG! I love it and swear by it. These comments are purely from an advocacy perspective. I'd love to see PG user base continue to grow. My .02 -Dan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Finding bloated indexes?
Is there a pg_stat_* table or the like that will show how bloated an index is? I am trying to squeeze some disk space and want to track down where the worst offenders are before performing a global REINDEX on all tables, as the database is rougly 400GB on disk and this takes a very long time to run. I have been able to do this with tables, using a helpful view posted to this list a few months back, but I'm not sure if I can get the same results on indexes. Thanks -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
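As far as I know there is no single pg_stat_* column that reports bloat directly, but ranking indexes by on-disk size is a cheap first cut before deciding what to REINDEX; this sketch uses pg_relation_size(), which is built in from 8.1 on (older releases have similar functions in contrib/dbsize):

    SELECT schemaname, relname, indexrelname,
           pg_relation_size(indexrelid) AS index_bytes
      FROM pg_stat_user_indexes
     ORDER BY index_bytes DESC
     LIMIT 20;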
[PERFORM] Planner doing seqscan before indexed join
8.0.3 - Linux 2.6.18.. Freshly vacuumed and analyzed This database has been humming along fine for a while now, but I've got one of those sticky queries that is taking much too long to finish. After some digging, I've found that the planner is choosing to apply a necessary seq scan to the table. Unfortunately, it's scanning the whole table, when it seems that it could have joined it to a smaller table first and reduce the amount of rows it would have to scan dramatically ( 70 million to about 5,000 ). The table "eventactivity" has about 70million rows in it, index on "incidentid" The table "keyword_incidents" is a temporary table and has incidentid as its primary key. It contains 5125 rows. eventmain and eventgeo both have 2.1 million. My hope is that I can convince the planner to do the join to keyword_incidents *first* and then do the seq scan for the LIKE condition. Instead, it seems that it's seqscanning the whole 70 million rows first and then doing the join, which takes a lot longer than I'd like to wait for it. Or, maybe I'm misreading the explain output? Thanks again -Dan - Here's the query: explain analyze select * from keyword_incidents, eventactivity, eventmain, eventgeo where eventmain.incidentid = keyword_incidents.incidentid and eventgeo.incidentid = keyword_incidents.incidentid and eventactivity.incidentid = keyword_incidents.incidentid and ( recordtext like '%JOSE CHAVEZ%' ) order by eventmain.entrydate limit 1; --- Limit (cost=2388918.07..2388918.08 rows=1 width=455) (actual time=81771.186..81771.292 rows=26 loops=1) -> Sort (cost=2388918.07..2388918.08 rows=1 width=455) (actual time=81771.180..81771.215 rows=26 loops=1) Sort Key: eventmain.entrydate -> Nested Loop (cost=0.00..2388918.06 rows=1 width=455) (actual time=357.389..81770.982 rows=26 loops=1) -> Nested Loop (cost=0.00..2388913.27 rows=1 width=230) (actual time=357.292..81767.385 rows=26 loops=1) -> Nested Loop (cost=0.00..2388909.33 rows=1 width=122) (actual time=357.226..81764.501 rows=26 loops=1) -> Seq Scan on eventactivity (cost=0.00..2388874.46 rows=7 width=84) (actual time=357.147..81762.582 rows=27 loops=1) Filter: ((recordtext)::text ~~ '%JOSE CHAVEZ%'::text) -> Index Scan using keyword_incidentid_pkey on keyword_incidents (cost=0.00..4.97 rows=1 width=38) (actual time=0.034..0.036 rows=1 loops=27) Index Cond: (("outer".incidentid)::text = (keyword_incidents.incidentid)::text) -> Index Scan using eventgeo_incidentid_idx on eventgeo (cost=0.00..3.93 rows=1 width=108) (actual time=0.076..0.081 rows=1 loops=26) Index Cond: (("outer".incidentid)::text = (eventgeo.incidentid)::text) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..4.78 rows=1 width=225) (actual time=0.069..0.075 rows=1 loops=26) Index Cond: (("outer".incidentid)::text = (eventmain.incidentid)::text) Total runtime: 81771.529 ms (15 rows) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Determining server load from client
Dan Harris wrote: I've found that it would be helpful to be able to tell how busy my dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before pounding it with some OLAP-type queries. ..snip Thank you all for your great ideas! I'm going to try the perl function as that seems like a very elegant way of doing it. -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Determining server load from client
I've found that it would be helpful to be able to tell how busy my dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before pounding it with some OLAP-type queries. Specifically, I have a multi-threaded client program that needs to run several thousand sequential queries. I broke it into threads to take advantage of the multi-core architecture of the server hardware. It would be very nice if I could check the load of the server at certain intervals to throttle the number of concurrent queries and mitigate load problems when other processes might be already inducing a significant load. I have seen some other nice back-end things exposed through PG functions ( e.g. database size on disk ) and wondered if there was anything applicable to this. Even if it can't return the load average proper, is there anything else in the pg_* tables that might give me a clue how "busy" the server is for a period of time? I've thought about allowing an ssh login without a keyphrase to log in and capture it. But, the client process is running as an apache user. Giving the apache user a shell login to the DB box does not seem like a smart idea for obvious security reasons... So far, that's all I can come up with, other than a dedicated socket server daemon on the DB machine to do it. Any creative ideas are welcomed :) Thanks -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
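One way to expose the server's load average over an ordinary database connection is an untrusted PL/Perl function that reads /proc/loadavg; a sketch, assuming plperlu is installed on the server:

    CREATE OR REPLACE FUNCTION server_loadavg() RETURNS float8 AS $$
        open(my $fh, '<', '/proc/loadavg') or die "cannot read /proc/loadavg";
        my $line = <$fh>;
        close($fh);
        my ($one_min) = split ' ', $line;
        return $one_min;    # 1-minute load average
    $$ LANGUAGE plperlu;

    -- the client can then poll this and throttle its own query threads
    SELECT server_loadavg();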
Re: [PERFORM] General advice on user functions
Thank you all for your ideas. I appreciate the quick response. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] General advice on user functions
I have a new task of automating the export of a very complex Crystal Report. One thing I have learned in the last 36 hours is that the export process to PDF is really, really, slooww.. Anyway, that is none of your concern. But, I am thinking that I can somehow utilize some of PG's strengths to work around the bottleneck in Crystal. The main problem seems to be that tens of thousands of rows of data must be summarized in the report and calculations made. Based on my recent experience, I'd say that this task would be better suited to PG than relying on Crystal Reports to do the summarizing. The difficulty I'm having is that the data needed is from about 50 different "snapshots" of counts over time. The queries are very simple, however I believe I am going to need to combine all of these queries into a single function that runs all 50 and then returns just the count(*) of each as a separate "column" in a single row. I have been Googling for hours and reading about PL/pgSQL functions in the PG docs and I have yet to find examples that return multiple items in a single row. I have seen cases that return "sets of", but that appears to be returning multiple rows, not columns. Maybe I'm barking up the wrong tree? Here's the gist of what I need to do: 1) query the count of rows that occurred between 14 months ago and 12 months ago for a given criteria, then count the rows that occurred between 2 months ago and current. Repeat for 50 different where clauses. 2) return each count(*) as a "column" so that in the end I can say:

select count_everything( ending_date );

and have it return to me:

 count_a_lastyear  count_a_last60  count_b_lastyear  count_b_last60
 ----------------  --------------  ----------------  --------------
              100             150               200             250

I'm not even sure if a function is what I'm after, maybe this can be done in a view? I am embarrassed to ask something that seems like it should be easy, but some key piece of knowledge is escaping me on this. I don't expect someone to write this for me, I just need a nudge in the right direction and maybe a URL or two to get me started. Thank you for reading this far. -Dan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
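On 8.1 and later, a PL/pgSQL function with OUT parameters returns exactly that shape -- one row with many named columns (on older releases the same thing is done by returning a composite type). A sketch with invented table and column names, showing just two of the fifty counts:

    CREATE OR REPLACE FUNCTION count_everything(ending_date date,
                                                OUT count_a_lastyear bigint,
                                                OUT count_a_last60   bigint)
    RETURNS record AS $$
    BEGIN
        SELECT count(*) INTO count_a_lastyear
          FROM some_table
         WHERE category = 'A'
           AND entry_date BETWEEN ending_date - interval '14 months'
                              AND ending_date - interval '12 months';

        SELECT count(*) INTO count_a_last60
          FROM some_table
         WHERE category = 'A'
           AND entry_date > ending_date - interval '2 months';

        -- ...repeat for the remaining criteria...
    END;
    $$ LANGUAGE plpgsql;

    SELECT * FROM count_everything('2006-01-31');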
Re: [PERFORM] quad or dual core Intel CPUs
Arjen van der Meijden wrote: But be aware that there can be substantial and unexpected differences on this relatively new platform due to simply changing the OS, like we saw when going from linux 2.6.15 to 2.6.18, as you can see here: http://tweakers.net/reviews/657/2 Having upgraded to 2.6.18 fairly recently, I am *very* interested in what caused the throughput to drop in 2.6.18? I haven't done any benchmarking on my system to know if it affected my usage pattern negatively, but I am curious if anyone knows why this happened? -Dan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM]Is it possible to start two instances of postgresql?
[EMAIL PROTECTED] wrote: Both of the two databases are live but are used for two different web apps. My company doesn't want to spend more to buy a new server, so I am thinking of implementing both under the same server and one instance. Just as an anecdote, I am running 30 databases on a single instance and it's working quite well. There may be reasons to run multiple instances, but it seems like tuning them to cooperate for memory would pose some problems - e.g. effective_cache_size. -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Killing long-running queries
Tom Lane wrote You should be using SIGINT, not SIGTERM. regards, tom lane Thank you very much for clarifying this point! It works :) ---(end of broadcast)--- TIP 6: explain analyze is your friend
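For reference, a sketch of the per-query approach: find the backend in pg_stat_activity, then cancel just that query, either with kill -INT <pid> as the postgres user or, from SQL, with pg_cancel_backend() (the pid below is obviously made up):

    -- what is everybody running?
    SELECT procpid, usename, current_query
      FROM pg_stat_activity
     WHERE current_query <> '<IDLE>';

    -- cancel the current query of one backend without touching anything else
    SELECT pg_cancel_backend(12345);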
[PERFORM] Killing long-running queries
My database is used primarily in an OLAP-type environment. Sometimes my users get a little carried away and find some way to slip past the sanity filters in the applications and end up bogging down the server with queries that run for hours and hours. And, of course, what users tend to do is to keep queuing up more queries when they don't see the first one return instantly :) So, I have been searching for a way to kill an individual query. I read in the mailing list archives that you could 'kill' the pid. I've tried this a few times and more than once, it has caused the postmaster to die(!), terminating every query that was in process, even unrelated to that query. Is there some way I can just kill a query and not risk breaking everything else when I do it? Thanks ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Encouraging multi-table join order
Tom Lane wrote: What the stats entry is saying is that the most common entries occur about 75000 times apiece (0.0017 * 45e6), which is what's scaring the planner here ;-). I think those frequencies are artificially high though. The default statistics sample size is 3000 rows (300 * statistics target, actually), so those numbers correspond to 5 or 4 rows in the sample, which is probably just random chance. Try increasing the stats targets for this table to 100, then re-ANALYZE and see what you get. The most_common_freqs entries might drop as much as a factor of 10. regards, tom lane Tom: I believe this was the problem. I upped the statistics to 100, for a sample size of 30k and now the planner does the correct nested loop/index scan and takes only 30 seconds! This is a HUGE performance increase. I wonder why the estimates were so far off the first time? This table has been ANALYZED regularly ever since creation. Once again, thank you and all of the developers for your hard work on PostgreSQL. This is by far the most pleasant management experience of any database I've worked on. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Encouraging multi-table join order
Tom Lane wrote: So it's estimating 5775 cost units per probe into eventactivity, which is pretty high --- it must think that a lot of rows will be retrieved by the index (way more than the 20 or so it thinks will get past the filter condition). What does the pg_stats entry for eventactivity.incidentid contain? select * from pg_stats where tablename = 'eventactivity' and attname='incidentid';

-[ RECORD 1 ]-----+--------------------------------------------------------------
schemaname        | public
tablename         | eventactivity
attname           | incidentid
null_frac         | 0
avg_width         | 14
n_distinct        | 8157
most_common_vals  | {P043190299,P051560740,P052581036,P052830218,P053100679,P053190889,P060370845,P042070391,P042690319,P043290117}
most_common_freqs | {0.0017,0.0017,0.0017,0.0017,0.0017,0.0017,0.0017,0.0013,0.0013,0.0013}
histogram_bounds  | {P022140319,P030471058,P033090308,P041961082,P042910689,P050311006,P051350254,P052261148,P053270945,P060240316,P061000287}
correlation       | 0.241737

It might be worth increasing the statistics target for that column to try to get a better estimate. How high should I set this? I read the default is 10, but I'm not sure if doubling this would make a difference or if I should be doing a much larger number. There's approx 45 million rows in the table, if that matters. Thanks again, Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
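For reference, the statistics target is raised per column and only takes effect at the next ANALYZE; 100 is the value that was eventually used in this thread:

    ALTER TABLE eventactivity ALTER COLUMN incidentid SET STATISTICS 100;
    ANALYZE eventactivity;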
Re: [PERFORM] Encouraging multi-table join order
Tom Lane wrote: That's very strange --- the estimated cost of the seqscan is high enough that the planner should have chosen a nestloop with inner indexscan on the big table. I'm not sure about the join-order point, but the hash plan for the first join seems wrong in any case. Um, you do have an index on eventactivity.incidentid, right? What's the datatype(s) of the incidentid columns? What happens to the plan if you turn off enable_hashjoin and enable_mergejoin? regards, tom lane Yes, eventactivity.incidentid is indexed. The datatype is varchar(40). Although, by checking this, I noticed that k_h.incidentid was varchar(100). Perhaps the difference in length between the keys caused the planner to not use the fastest method? I have no defense as to why those aren't the same.. I will make them so and check. Here's the EXPLAIN analyze with enable_hashjoin = off and enable_mergejoin = off : Limit (cost=4226535.73..4226544.46 rows=698 width=82) (actual time=74339.016..74356.521 rows=888 loops=1) -> Unique (cost=4226535.73..4226544.46 rows=698 width=82) (actual time=74339.011..74354.073 rows=888 loops=1) -> Sort (cost=4226535.73..4226537.48 rows=698 width=82) (actual time=74339.003..74344.031 rows=3599 loops=1) Sort Key: eventmain.entrydate, eventmain.incidentid, eventgeo.eventlocation, eventactivity.recordtext -> Nested Loop (cost=0.00..4226502.76 rows=698 width=82) (actual time=921.325..74314.959 rows=3599 loops=1) -> Nested Loop (cost=0.00..4935.61 rows=731 width=72) (actual time=166.354..14638.308 rows=1162 loops=1) -> Nested Loop (cost=0.00..2482.47 rows=741 width=50) (actual time=150.396..7348.013 rows=1162 loops=1) -> Index Scan using k_h_id_idx on k_h (cost=0.00..217.55 rows=741 width=14) (actual time=129.540..1022.243 rows=1162 loops=1) Index Cond: (id = 33396) Filter: ((entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) -> Index Scan using eventgeo_incidentid_idx on eventgeo (cost=0.00..3.04 rows=1 width=36) (actual time=5.260..5.429 rows=1 loops=1162) Index Cond: ((eventgeo.incidentid)::text = ("outer".incidentid)::text) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..3.30 rows=1 width=22) (actual time=5.976..6.259 rows=1 loops=1162) Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text) -> Index Scan using eventactivity1 on eventactivity (cost=0.00..5774.81 rows=20 width=52) (actual time=29.768..51.334 rows=3 loops=1162) Index Cond: (("outer".incidentid)::text = (eventactivity.incidentid)::text) Filter: ' '::text || (recordtext)::text) || ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Encouraging multi-table join order
I have a query that is intended to select from multiple "small tables" to get a limited subset of "incidentid" and then join with a "very large" table. One of the operations will require a sequential scan, but the planner is doing the scan on the very large table before joining the small ones, resulting in a huge amount of disk I/O. How would I make this query join the large table only after narrowing down the possible selections from the smaller tables? This is running on version 8.0.3. Thanks for any ideas. -Dan QUERY explain analyze select distinct eventmain.incidentid, eventmain.entrydate, eventgeo.eventlocation, recordtext as retdata from eventactivity join ( select incidentid from k_h where id = 33396 and k_h.entrydate >= '2006-1-1 00:00' and k_h.entrydate < '2006-4-8 00:00' ) id_keywords using ( incidentid ) , eventmain, eventgeo where eventmain.incidentid = eventactivity.incidentid and eventmain.incidentid = eventgeo.incidentid and ( ' ' || recordtext || ' ' like '%HAL%' ) and eventactivity.entrydate >= '2006-1-1 00:00' and eventactivity.entrydate < '2006-4-8 00:00' order by eventmain.entrydate limit 1; EXPLAIN ANALYZE OUTPUT Limit (cost=2521191.65..2521191.90 rows=6 width=187) (actual time=1360935.787..1361072.277 rows=1400 loops=1) -> Unique (cost=2521191.65..2521191.90 rows=6 width=187) (actual time=1360935.779..1361067.853 rows=1400 loops=1) -> Sort (cost=2521191.65..2521191.66 rows=6 width=187) (actual time=1360935.765..1360958.258 rows=16211 loops=1) Sort Key: eventmain.entrydate, eventmain.incidentid, eventactivity.recordtext, eventgeo.eventlocation -> Nested Loop (cost=219.39..2521191.57 rows=6 width=187) (actual time=1123.115..1360579.798 rows=16211 loops=1) -> Nested Loop (cost=219.39..2521173.23 rows=6 width=154) (actual time=1105.773..1325907.716 rows=16211 loops=1) -> Hash Join (cost=219.39..2521153.37 rows=6 width=66) (actual time=1069.476..1289608.261 rows=16211 loops=1) Hash Cond: (("outer".incidentid)::text = ("inner".incidentid)::text) -> Seq Scan on eventactivity (cost=0.00..2518092.06 rows=1532 width=52) (actual time=57.205..1288514.530 rows=2621 loops=1) Filter: ' '::text || (recordtext)::text) || ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) -> Hash (cost=217.53..217.53 rows=741 width=14) (actual time=899.128..899.128 rows=0 loops=1) -> Index Scan using k_h_id_idx on k_h (cost=0.00..217.53 rows=741 width=14) (actual time=55.097..893.883 rows=1162 loops=1) Index Cond: (id = 33396) Filter: ((entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..3.30 rows=1 width=88) (actual time=1.866..2.227 rows=1 loops=16211) Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text) -> Index Scan using eventgeo_incidentid_idx on eventgeo (cost=0.00..3.04 rows=1 width=75) (actual time=1.770..2.126 rows=1 loops=16211) Index Cond: ((eventgeo.incidentid)::text = ("outer".incidentid)::text) Total runtime: 1361080.787 ms (19 rows) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] vacuum full seems to hang on very small table
I have a table with 1 live row that I found has 115000 dead rows in it ( from a testing run ). I'm trying to VACUUM FULL the table and it has run for over 18 hours without completion. Considering the hardware on this box and the fact that performance seems reasonable in all other aspects, I'm confused as to why this would happen. The rest of the database is quite large ( 70 gigs on disk ) and I would expect a database-wide VACUUM FULL to take days to complete, but I just did 'vacuum full table_stats'. That should only do that table, correct? I'm running 8.0.3.

         Table "public.table_stats"
       Column        |            Type             | Modifiers
---------------------+-----------------------------+-----------
 count_cfs           | integer                     |
 count_ncfs          | integer                     |
 count_unitactivity  | integer                     |
 count_eventactivity | integer                     |
 min_eventmain       | timestamp without time zone |
 max_eventmain       | timestamp without time zone |
 min_eventactivity   | timestamp without time zone |
 max_eventactivity   | timestamp without time zone |
 geocoding_hitrate   | double precision            |
 recent_load         | timestamp without time zone |
 count_eventmain     | integer                     |

This is the table structure. Any ideas where to begin troubleshooting this? Thanks. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
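When a VACUUM FULL (or anything else) appears to hang, a quick check is whether it is actually stuck waiting for a lock rather than doing work; a sketch:

    -- lock requests that have not been granted yet
    SELECT pid, relation::regclass, mode, granted
      FROM pg_locks
     WHERE NOT granted;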
Re: [PERFORM] Help optimizing a slow index scan
Merlin Moncure wrote: As others will probably mention, effective queries on lot/long which is a spatial problem will require r-tree or gist. I don't have a lot of experience with exotic indexes but this may be the way to go. One easy optimization to consider making is to make an index on either (incidentid, entrydate) or (incident_id,long) which ever is more selective. This is 'yet another query' that would be fun to try out and tweak using the 8.2 upcoming row-wise comparison. merlin Thanks to everyone for your suggestions. One problem I ran into is that apparently my version doesn't support the GIST index that was mentioned. "function 'box' doesn't exist" ).. So I'm guessing that both this as well as the Earth Distance contrib require me to add on some more pieces that aren't there. Furthermore, by doing so, I am tying my queries directly to "postgres-isms". One of the long term goals of this project is to be able to fairly transparently support any ANSI SQL-compliant back end with the same code base. If I had full control over the query designs, I could make stored procedures to abstract this. However, I have to deal with a "gray box" third-party reporting library that isn't so flexible. I'll certainly consider going with something postgre-specific, but only as a last resort. I tried the multi-column index as mentioned above but didn't see any noticeable improvement in elapsed time, although the planner did use the new index. What is the real reason for the index not being very effective on these columns? Although the numbers are in a very limited range, it seems that the records would be very selective as it's not terribly common for multiple rows to share the same coords. Is the "8.2. upcoming row-wise comparison" something that would be likely to help me? Thanks again for your input ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
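For the record, the PostgreSQL-specific route mentioned above is the cube + earthdistance contrib modules, which make a bounding-box/radius search indexable; a sketch, assuming both contrib modules have been installed and loaded into the database:

    -- functional GiST index over the coordinates
    CREATE INDEX eventgeo_earth_idx ON eventgeo USING gist (ll_to_earth(lat, long));

    -- all events within 500 metres of a point
    SELECT incidentid
      FROM eventgeo
     WHERE earth_box(ll_to_earth(39.713, -104.992), 500) @> ll_to_earth(lat, long);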
Re: [PERFORM] Help optimizing a slow index scan
Dan Harris wrote: Markus Bertheau wrote: Have you tried using a GIST index on lat & long? These things are meant for two-dimensional data, whereas btree doesn't handle two-dimensional data that well. How many rows satisfy either of the long / lat condition? According to the analyze, less than 500 rows matched. I'll look into GIST indexes, thanks for the feedback. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster When I try to create a GIST index, I get the following error: create index eventgeo_lat_idx on eventgeo using GIST (lat); ERROR: data type double precision has no default operator class for access method "gist" HINT: You must specify an operator class for the index or define a default operator class for the data type. I'm not sure what a "default operator class" is, exactly.. -Dan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Help optimizing a slow index scan
Markus Bertheau wrote: Have you tried using a GIST index on lat & long? These things are meant for two-dimensional data, whereas btree doesn't handle two-dimensional data that well. How many rows satisfy either of the long / lat condition? According to the analyze, less than 500 rows matched. I'll look into GIST indexes, thanks for the feedback. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Help optimizing a slow index scan
Dan Harris wrote: explain analyze doh.. sorry to reply to my own post. But I messed up copying some of the fields into the select statement that you'll see in the "Sort Key" section of the analyze results. The mistake was mine. Everything else is "normal" between the query and the plan. -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Help optimizing a slow index scan
explain analyze select distinct eventmain.incidentid, eventmain.entrydate, eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy from eventmain, eventgeo where eventmain.incidentid = eventgeo.incidentid and ( long > -104.998027962962 and long < -104.985957781349 ) and ( lat > 39.7075542720006 and lat < 39.7186195832938 ) and eventmain.entrydate > '2006-1-1 00:00' and eventmain.entrydate <= '2006-3-17 00:00' order by eventmain.entrydate; QUERY PLAN - Unique (cost=121313.81..121330.72 rows=451 width=178) (actual time=723719.761..723726.875 rows=1408 loops=1) -> Sort (cost=121313.81..121314.94 rows=451 width=178) (actual time=723719.755..723721.807 rows=1408 loops=1) Sort Key: eventmain.entrydate, eventmain.disposition, eventmain.incidentid, eventgeo.reportingarea, eventgeo.beatid, eventmain.finaltype, eventmain.casenumber, eventgeo.eventlocation, eventmain.insertdate, eventmain.priority, eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy -> Nested Loop (cost=0.00..121293.93 rows=451 width=178) (actual time=1916.230..723712.900 rows=1408 loops=1) -> Index Scan using eventgeo_lat_idx on eventgeo (cost=0.00..85488.05 rows=10149 width=76) (actual time=0.402..393376.129 rows=22937 loops=1) Index Cond: ((lat > 39.7075542720006::double precision) AND (lat < 39.7186195832938::double precision)) Filter: ((long > -104.998027962962::double precision) AND (long < -104.985957781349::double precision)) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..3.52 rows=1 width=119) (actual time=14.384..14.392 rows=0 loops=22937) Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text) Filter: ((entrydate > '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate <= '2006-03-17 00:00:00'::timestamp without time zone)) Total runtime: >>> 723729.238 ms(!) <<< I'm trying to figure out why it's consuming so much time on the index scan for eventgeo_lat_idx. Also, I have an index on "long" that the planner does not appear to find helpful. There are 3.3 million records in eventmain and eventgeo. The server has a reasonably fast RAID10 setup with 16x 15k RPM drives and 12GB of RAM ( 11GB listed as "cache" by vmstat ). Running version 8.0.2 on linux kernel 2.6.12. I have just vacuum analyze'd both tables, rebuilt the eventgeo_lat_idx index and reran the query multiple times to see if caching helped ( it didn't help much ). The server seems to be fine utilizing other fields from this table but using "long" and "lat" seem to drag it down significantly. Is it because there's such slight differences between the records, since they are all within a few hundredths of a degree from each other? Thanks for your time and ideas. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Ultra-cheap NVRAM device
On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote: I thought this might be interesting, not the least due to the extremely low price ($150 + the price of regular DIMMs): Replying before my other post came through.. It looks like their benchmarks are markedly improved since the last article I read on this. There may be more interest now.. -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Ultra-cheap NVRAM device
On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote: I thought this might be interesting, not the least due to the extremely low price ($150 + the price of regular DIMMs): This has been posted before, and the main reason nobody got very excited is that: a) it only uses the PCI bus to provide power to the device, not for I/O b) It is limited to SATA bandwidth c) The benchmarks did not prove it to be noticeably faster than a good single SATA drive A few of us were really excited at first too, until seeing the benchmarks.. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Monitoring Postgresql performance
On Sep 28, 2005, at 8:32 AM, Arnau wrote: Hi all, I have been "googling" a bit searching info about a way to monitor postgresql (CPU & Memory, num processes, ... ) You didn't mention your platform, but I have an xterm open pretty much continuously for my DB server that runs plain old top. I have customized my settings enough that I can pretty much see anything I need to from there. -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Poor performance on HP Package Cluster
Do you have any sources for that information? I am running dual SmartArray 6402's in my DL585 and haven't noticed anything poor about their performance. On Sep 1, 2005, at 2:24 PM, Luke Lonergan wrote: Are you using the built-in HP SmartArray RAID/SCSI controllers? If so, that could be your problem, they are known to have terrible and variable performance with Linux. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] extremly low memory usage
On Aug 19, 2005, at 3:01 PM, Jeremiah Jahn wrote: Rebuild in progress with just ext3 on the raid array...will see if this helps the access times. From my recent experiences, I can say ext3 is probably not a great choice for Pg databases. If you check the archives you'll see there's a lot of discussion about various journalling filesystems and ext3 usually(always?) comes up on the bottom as far as performance goes. If you insist on using it, I would at least recommend the noatime option in fstab and using data=writeback to get the faster of the journal modes. XFS seems to be a trusted choice, followed by Reiser and JFS both with the occasional controversy when the comparisons pop up. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Query plan looks OK, but slow I/O - settings advice?
On Aug 19, 2005, at 12:55 AM, Jeffrey W. Baker wrote: On Tue, 2005-08-16 at 10:46 -0700, Roger Hand wrote: Have you considered booting your machine with elevator=deadline? Although I'm not the OP for this problem, I thought I'd try it out. WOW.. this should be in a Pg tuning guide somewhere. I added this to my server tonight just for kicks and saw a pronounced improvement in IO performance. Thank you very much for mentioning this on the list. I didn't have a long enough maintenance window to do solid benchmarking, but I can say for certain that the change was noticeable, especially in VACUUM operations. Specs for the server: PG 8.0.1 Linux 2.6.12-3 kernel 4xOpteron 2.2 12GB RAM 16-drive RAID 10 XFS mounted with noatime pg_xlog on separate RAID controller -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Speedier count(*)
Thanks for all the great ideas. I have more options to evaluate now. -Dan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Speedier count(*)
I have a web page for my customers that shows them count of records and some min/max date ranges in each table of a database, as this is how we bill them for service. They can log in and check the counts at any time. I'd like for the counts to be as fresh as possible by keeping this dynamic, but I will use a periodic 'snapshot'/cron job if that is the only option to speed this up. I have thought about using the table statistics, but the estimate error is probably unacceptable because of the billing purposes. For some reason, the SQL Server we migrated the app from can return count(*) in a split second on multi-million row tables, even though it is a MUCH slower box hardware-wise, but it's now taking many seconds to run. I have read in the archives the problems MVCC brings into the count(*) dilemma forcing Pg to run a seq scan to get counts. Does SQLServer not use MVCC or have they found another approach for arriving at this number? Compounding all the min/max and counts from other tables and all those queries take about a minute to run. The tables will contain anywhere from 1 million to 40 million rows. Also, I am using "select ... group by ... order by .. limit 1" to get the min/max since I have already been bit by the issue of min() max() being slower. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
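One workaround that keeps the page fast without giving up exact numbers is a trigger-maintained counter table, so the billing query reads one row instead of scanning millions. This is only a sketch with a hypothetical table name; it assumes plpgsql is installed and that the insert volume can absorb the extra per-row trigger cost:

CREATE TABLE row_counts (tablename text PRIMARY KEY, n bigint NOT NULL);
INSERT INTO row_counts SELECT 'mytable', count(*) FROM mytable;  -- seed once

CREATE OR REPLACE FUNCTION mytable_count_trig() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE row_counts SET n = n + 1 WHERE tablename = 'mytable';
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE row_counts SET n = n - 1 WHERE tablename = 'mytable';
    END IF;
    RETURN NULL;  -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_count AFTER INSERT OR DELETE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE mytable_count_trig();

-- the customer-facing page then runs:
SELECT n FROM row_counts WHERE tablename = 'mytable';

The trade-off is that every writer now touches the same counter row, which can become a contention point under heavy concurrent insert load; a periodic snapshot refreshed by cron avoids that at the cost of freshness.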
Re: [PERFORM] Table locking problems?
On Aug 9, 2005, at 3:51 PM, John A Meinel wrote: Dan Harris wrote: On Aug 10, 2005, at 12:49 AM, Steve Poe wrote: Dan, Do you mean you did RAID 1 + 0 (RAID 10) or RAID 0 + 1? Just a clarification, since RAID 0 is still a single-point of failure even if RAID1 is on top of RAID0. Well, you tell me if I stated incorrectly. There are two raid enclosures with 7 drives in each. Each is on its own bus on a dual- channel controller. Each box has a stripe across its drives and the enclosures are mirrors of each other. I understand the controller could be a single point of failure, but I'm not sure I understand your concern about the RAID structure itself. In this configuration, if you have a drive fail on both controllers, the entire RAID dies. Lets label them A1-7, B1-7, because you stripe within a set, if a single one of A dies, and a single one of B dies, you have lost your entire mirror. The correct way of doing it, is to have A1 be a mirror of B1, and then stripe above that. Since you are using 2 7-disk enclosures, I'm not sure how you can do it well, since it is not an even number of disks. Though if you are using software RAID, there should be no problem. The difference is that in this scenario, *all* of the A drives can die, and you haven't lost any data. The only thing you can't lose is a matched pair (eg losing both A1 and B1 will cause complete data loss) I believe the correct notation for this last form is RAID 1 + 0 (RAID10) since you have a set of RAID1 drives, with a RAID0 on-top of them. I have read up on the difference now. I don't understand why it's a "single point of failure". Technically any array could be a "single point" depending on your level of abstraction. In retrospect, I probably should have gone 8 drives in each and used RAID 10 instead for the better fault-tolerance, but it's online now and will require some planning to see if I want to reconfigure that in the future. I wish HP's engineer would have promoted that method instead of 0+1.. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Table locking problems?
On Aug 10, 2005, at 12:49 AM, Steve Poe wrote: Dan, Do you mean you did RAID 1 + 0 (RAID 10) or RAID 0 + 1? Just a clarification, since RAID 0 is still a single-point of failure even if RAID1 is on top of RAID0. Well, you tell me if I stated incorrectly. There are two raid enclosures with 7 drives in each. Each is on its own bus on a dual- channel controller. Each box has a stripe across its drives and the enclosures are mirrors of each other. I understand the controller could be a single point of failure, but I'm not sure I understand your concern about the RAID structure itself. How many users are connected when your update / delete queries are hanging? Have you done an analyze verbose on those queries? Most of the traffic is from programs we run to do analysis of the data and managing changes. At the time I noticed it this morning, there were 10 connections open to the database. That rarely goes above 20 concurrent. As I said in my other response, I believe that the log will only contain the query at the point the query finishes, so if it never finishes... Have you made changes to the postgresql.conf? kernel.vm settings? IO scheduler? I set shmmax appropriately for my shared_buffers setting, but that's the only kernel tweak. If you're not doing so already, you may consider running sar (iostat) to monitor when the hanging occurs if their is a memory / IO bottleneck somewhere. I will try that. Thanks ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Table locking problems?
On Aug 9, 2005, at 1:08 PM, Tom Lane wrote: "Joshua D. Drake" <[EMAIL PROTECTED]> writes: My experience is that when this type of thing happens it is typically specific queries that cause the problem. If you turn on statement logging you can get the exact queries and debug from there. Here are some things to look for: Is it a large table (and thus large indexes) that it is updating? Is the query using indexes? Is the query modifying a LOT of rows? Another thing to look at is foreign keys. Dan could be running into problems with an update on one side of an FK being blocked by locks on the associated rows on the other side. regards, tom lane Tom, Steve, Josh: Thank you for your ideas. The updates are only on a single table, no joins. I had stats collection turned off. I have turned that on again so that I can try and catch one while the problem is occurring. The last table it did this on was about 3 million records, with 4 single-column indexes on it. The problem I had with statement logging is that if the query never finishes, it doesn't get logged as far as I can tell. So everything that did get logged was normal and would run with no issues in psql by copying and pasting it. The rows updated will certainly vary by query. I really need to "catch it in the act" with stats collection on so I can get the query from pg_stat_activity. Once I get it, I will play with explains and see if I can reproduce it outside the wild. Thanks again for your help. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
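For catching it in the act, something along these lines is a useful snapshot to take when statements start piling up (a sketch only; the column names match the 8.x-era views, where the process id is procpid and the text is current_query, and it relies on stats_command_string being on):

SELECT l.pid, l.relation::regclass AS locked_table, l.mode, l.granted,
       a.usename, a.query_start, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE NOT l.granted
ORDER BY a.query_start;

Any row returned is a backend waiting on a lock; repeating the join for the granted side of the same relation shows which backend is holding it up.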
[PERFORM] Table locking problems?
I thought I would send this to pg-performance since so many people helped me with my speed issues recently. I was definitely IO- bottlenecked. Since then, I have installed 2 RAID arrays with 7 15k drives in them in RAID 0+1 as well as add a new controller card with 512MB of cache on it. I also created this new partition on the RAID as XFS instead of ext3. These changes have definitely improved performance, but I am now finding some trouble with UPDATE or DELETE queries "hanging" and never releasing their locks. As this happens, other statements queue up behind it. It seems to occur at times of very high loads on the box. Is my only option to kill the query ( which usually takes down the whole postmaster with it! ouch ). Could these locking issues be related to the other changes I made? I'm really scared that this is related to choosing XFS, but I sure hope not. How should I go about troubleshooting the "problem" queries? They don't seem to be specific to a single table or single database. I'm running 8.0.1 on kernel 2.6.12-3 on 64-bit Opterons if that matters.. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Fwd: Help with view performance problem
On Jul 28, 2005, at 8:38 AM, Chris Hoover wrote: I did some more testing, and ran the explain analyze on the problem. In my session I did a set enable_hashjoin = false and then ran the analyze. This caused it to use the indexes as I have been expecting it to do. Now, how can I get it to use the indexes w/o manipulating the environment? What make postgresql want to sequentially scan and use a hash join? thanks, Chris explain analyze with set_hashjoin=false; prob_db=#explain analyze select * from clm_com; I had something similar to this happen recently. The planner was choosing a merge join and seq scan because my 'random_page_cost' was set too high. I had it at 3 , and ended up settling at 1.8 to get it to correctly use my indices. Once that change was in place, the planner did the 'right' thing for me. Not sure if this will help you, but it sounds similar. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
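For anyone trying the same thing, a sketch of how that experiment usually goes: adjust the cost in one session, re-check the plan, and only then make it permanent (1.8 is just the value that happened to work for me, not a general recommendation; the view and database names are taken from the quoted message):

SET random_page_cost = 1.8;
EXPLAIN ANALYZE SELECT * FROM clm_com;   -- did the plan switch to the index?

-- once satisfied, make it stick for the whole database
ALTER DATABASE prob_db SET random_page_cost = 1.8;
-- or set random_page_cost in postgresql.conf and reload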
[PERFORM] faster INSERT with possible pre-existing row?
I am working on a process that will be inserting tens of millions of rows and need this to be as quick as possible. The catch is that for each row I could potentially insert, I need to look and see if the relationship is already there to prevent multiple entries. Currently I am doing a SELECT before doing the INSERT, but I recognize the speed penalty in doing two operations. I wonder if there is some way I can say "insert this record, only if it doesn't exist already". To see if it exists, I would need to compare 3 fields instead of just enforcing a primary key. Even a small increase per record, just a few percent faster, compounded over the whole load could be a significant reduction. Thanks for any ideas you might have. -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
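A sketch of the single-statement version, with hypothetical table and column names (target, staging, f1..f3, payload); it assumes the rows to load are staged in a table and that the three fields together identify a row. On 9.5 and later, a unique index plus ON CONFLICT DO NOTHING is the cleaner, race-free variant:

-- portable form: only insert rows that have no match on the three fields
INSERT INTO target (f1, f2, f3, payload)
SELECT s.f1, s.f2, s.f3, s.payload
FROM staging s
WHERE NOT EXISTS (
    SELECT 1 FROM target t
    WHERE t.f1 = s.f1 AND t.f2 = s.f2 AND t.f3 = s.f3
);

-- PostgreSQL 9.5+ alternative, given a unique index on (f1, f2, f3):
-- INSERT INTO target (f1, f2, f3, payload)
-- SELECT f1, f2, f3, payload FROM staging
-- ON CONFLICT (f1, f2, f3) DO NOTHING;

The NOT EXISTS form does one pass instead of a SELECT-then-INSERT round trip per row, but it is not safe against two concurrent loaders inserting the same key at the same time; only the unique constraint gives that guarantee.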
[PERFORM] Coraid/AoE device experience?
Lately, I've been reading a lot about these new Coraid AoE RAID devices ( http://www.coraid.com ). They tout it as being fast and cheap and better than iSCSI due to the lack of TCP/IP over the wire. Is it likely that a 15-drive RAID 10 Linux software RAID would outperform a 4-drive 10k SCSI RAID 0+1 for a heavy-loaded database? If not software RAID, how about their dedicated RAID controller blade? I'm definitely IO bound right now and starving for spindles. Does this make sense or is it too good to be true? Thanks -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Really bad diskio
On Jul 15, 2005, at 2:39 PM, Ron Wills wrote: Hello all, I'm running postgres 7.4.5 on a dual 2.4GHz Athlon, 1GB of RAM and a 3Ware SATA RAID. Operating system? Which file system are you using? I was having a similar problem just a few days ago and learned that ext3 was the culprit. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 15, 2005, at 9:09 AM, Dan Harris wrote: On Jul 14, 2005, at 10:12 PM, John A Meinel wrote: My biggest question is why the planner things the Nested Loop would be so expensive. Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...) here's some of my postgresql.conf. Feel free to blast me if I did something idiotic here. shared_buffers = 5 effective_cache_size = 1348000 random_page_cost = 3 work_mem = 512000 max_fsm_pages = 8 log_min_duration_statement = 6 fsync = true ( not sure if I'm daring enough to run without this ) wal_buffers = 1000 checkpoint_segments = 64 checkpoint_timeout = 3000 # FOR PG_AUTOVACUUM --# stats_command_string = true stats_row_level = true Sorry, I forgot to re-post my hardware specs. HP DL585 4 x 2.2 GHz Opteron 12GB RAM SmartArray RAID controller, 1GB hardware cache, 4x73GB 10k SCSI in RAID 0+1 ext2 filesystem Also, there are 30 databases on the machine, 27 of them are identical schemas. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 10:12 PM, John A Meinel wrote: My biggest question is why the planner things the Nested Loop would be so expensive. Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...) here's some of my postgresql.conf. Feel free to blast me if I did something idiotic here. shared_buffers = 5 effective_cache_size = 1348000 random_page_cost = 3 work_mem = 512000 max_fsm_pages = 8 log_min_duration_statement = 6 fsync = true ( not sure if I'm daring enough to run without this ) wal_buffers = 1000 checkpoint_segments = 64 checkpoint_timeout = 3000 # FOR PG_AUTOVACUUM --# stats_command_string = true stats_row_level = true ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 7:15 PM, John A Meinel wrote: Is the distribution of your rows uneven? Meaning do you have more rows with a later id than an earlier one? There are definitely some id's that will have many times more than the others. If I group and count them, the top 10 are fairly dominant in the table. Hmm.. How to do it permanently? Well you could always issue "set join_collapse_limit = 1; select * from " But obviously that isn't what you prefer. :) I think there are things you can do to make merge join more expensive than a nested loop, but I'm not sure what they are. Maybe someone else has some ideas to encourage this behavior for future work? Setting it on a per-connection basis is doable, but would add some burden to us in code. What I really don't understand is that the estimates dropped as well. The estimated row count drops to 3k instead of > 1M. The real question is why does the planner think it will be so expensive? select count(*) from k_b join k_r using (incidentid) where k_b.id=107 and k_r.id=94; count --- 373 Well, this says that they are indeed much more selective. Each one has > 1k rows, but together you end up with only 400. Is this a bad thing? Is this not "selective enough" to make it much faster? Overall, I'm much happier now after seeing the new plan come about; if I can find a way to make that join_collapse behavior permanent, I can certainly live with these numbers. Thanks again for your continued efforts. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
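For the "make it permanent" part, a sketch of the usual options so the application doesn't have to issue SET on every connection (the database and role names here are placeholders):

-- default for one database
ALTER DATABASE mydb SET join_collapse_limit = 1;

-- or only for the role that runs these reports
ALTER USER report_user SET join_collapse_limit = 1;

-- or globally, in postgresql.conf:
-- join_collapse_limit = 1

The caveat is that this ties the planner's hands for every query issued under that scope, not just this one, so the per-role form is usually the safest place to start.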
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 5:12 PM, John A Meinel wrote: Dan Harris wrote: Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; Once again, do this and post the results. We might just need to tweak your settings so that it estimates the number of rows correctly, and we don't need to do anything else. Ok, sorry I missed these the first time through: explain analyze select incidentid from k_b where id = 107; QUERY PLAN Index Scan using k_b_idx on k_b (cost=0.00..1926.03 rows=675 width=14) (actual time=0.042..298.394 rows=2493 loops=1) Index Cond: (id = 107) Total runtime: 299.103 ms select count(*) from k_b; count 698350 ( sorry! I think I said this one only had tens of thousands in it ) explain analyze select incidentid from k_r where id = 94; QUERY PLAN - Index Scan using k_r_idx on k_r (cost=0.00..2137.61 rows=757 width=14) (actual time=0.092..212.187 rows=10893 loops=1) Index Cond: (id = 94) Total runtime: 216.498 ms (3 rows) select count(*) from k_r; count 671670 That one is quite a bit slower, yet it's the same table structure and same index as k_b, also it has fewer records. I did run VACUUM ANALYZE immediately before running these queries. It seems a lot better with the join_collapse set. \ Well, the planner is powerful enough to flatten nested selects. To make it less "intelligent" you can do: SET join_collapse_limit 1; or SET join_collapse_limit 0; Which should tell postgres to not try and get tricky with your query. Again, *usually* the planner knows better than you do. So again just do it to see what you get. Ok, when join_collapse_limit = 1 I get this now: explain analyze select recordtext from eventactivity join ( select incidentid from k_r join k_b using (incidentid) where k_r.id = 94 and k_b.id = 107 ) a using (incidentid ); QUERY PLAN --- Nested Loop (cost=0.00..156509.08 rows=2948 width=35) (actual time=1.555..340.625 rows=24825 loops=1) -> Nested Loop (cost=0.00..5361.89 rows=6 width=28) (actual time=1.234..142.078 rows=366 loops=1) -> Index Scan using k_b_idx on k_b (cost=0.00..1943.09 rows=681 width=14) (actual time=0.423..56.974 rows=2521 loops=1) Index Cond: (id = 107) -> Index Scan using k_r_idx on k_r (cost=0.00..5.01 rows=1 width=14) (actual time=0.031..0.031 rows=0 loops=2521) Index Cond: ((k_r.id = 94) AND ((k_r.incidentid)::text = ("outer".incidentid)::text)) -> Index Scan using eventactivity1 on eventactivity (cost=0.00..25079.55 rows=8932 width=49) (actual time=0.107..0.481 rows=68 loops=366) Index Cond: ((eventactivity.incidentid)::text = ("outer".incidentid)::text) Total runtime: 347.975 ms MUCH better! Maybe you can help me understand what I did and if I need to make something permanent to get this behavior from now on? If you have analyzed recently can you do: SELECT relname, reltuples FROM pg_class WHERE relname='eventactivity'; It is a cheaper form than "SELECT count(*) FROM eventactivity" to get an approximate estimate of the number of rows. But if it isn't too expensive, please also give the value from SELECT count(*) FROM eventactivity. Again, that helps us know if your tables are up-to-date. Sure: select relname, reltuples from pg_class where relname='eventactivity'; relname| reltuples ---+- eventactivity | 3.16882e+07 select count(*) from eventactivity; count -- 31871142 I don't know how selective your keys are, but one of these queries should probably structure it better for the planner. 
It depends a lot on how selective your query is. eventactivity currently has around 36 million rows in it. There should only be maybe 200-300 incidentids at most that will be matched with the combination of k_b and k_r. That's why I was thinking I could somehow get a list of just the incidentids that matched the id = 94 and id = 107 in k_b and k_r first. Then, I would only need to grab a few hundred out of 36 million rows from eventactivity. Well, you can also try: SELECT count(*) FROM k_b JOIN k_r USING (i
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 9:42 AM, John A Meinel wrote: You might try giving it a little bit more freedom with: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity, k_r, k_b WHERE eventactivity.incidentid = k_r.incidentid AND eventactivity.incidentid = k_b.incidentid AND k_r.id = 94 AND k_b.id = 107 -- AND k_r.incidentid = k_b.incidentid ; I'm pretty sure that would give identical results, just let the planner have a little bit more freedom about how it does it. Also the last line is commented out, because I think it is redundant. Ok, I tried this one. My ssh keeps getting cut off by a router somewhere between me and the server due to inactivity timeouts, so all I know is that both the select and explain analyze are taking over an hour to run. Here's the explain select for that one, since that's the best I can get. explain select recordtext from eventactivity,k_r,k_b where eventactivity.incidentid = k_r.incidentid and eventactivity.incidentid = k_b.incidentid and k_r.id = 94 and k_b.id = 107; QUERY PLAN -- Merge Join (cost=9624.61..4679590.52 rows=151009549 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=4766.92..4547684.26 rows=16072733 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4186753.16 rows=46029271 width=49) -> Sort (cost=4766.92..4771.47 rows=1821 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..4668.31 rows=1821 width=68) Index Cond: (id = 107) -> Sort (cost=4857.69..4862.39 rows=1879 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..4755.52 rows=1879 width=68) Index Cond: (id = 94) (13 rows) You might also try: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN k_r USING (incidentid) JOIN k_b USING (incidentid) WHERE k_r.id = 94 AND k_b.id = 107 ; Similar results here. The query is taking at least an hour to finish. explain select recordtext from eventactivity join k_r using ( incidentid ) join k_b using (incidentid ) where k_r.id = 94 and k_b.id = 107; QUERY PLAN -- Merge Join (cost=9542.77..4672831.12 rows=148391132 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=4726.61..4542825.87 rows=15930238 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4184145.43 rows=46000104 width=49) -> Sort (cost=4726.61..4731.13 rows=1806 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..4628.92 rows=1806 width=68) Index Cond: (id = 107) -> Sort (cost=4816.16..4820.82 rows=1863 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..4714.97 rows=1863 width=68) Index Cond: (id = 94) (13 rows) Also, if possible give us the EXPLAIN ANALYZE so that we know if the planner is making accurate estimates. (You might send an EXPLAIN while waiting for the EXPLAIN ANALYZE to finish) You can also try disabling merge joins, and see how that changes things. Are there any negative sideaffects of doing this? Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; And see if postgres estimates the number of rows properly. 
I assume you have recently VACUUM ANALYZEd, which means you might need to update the statistics target (ALTER TABLE k_b ALTER COLUMN incidientid SET STATISTICS 100) default is IIRC 10, ranges from 1-1000, higher is more accurate, but makes ANALYZE slower. Right now, it looks like pg is trying to sort the entire eventactivity table for the merge join which is taking several minutes to do. Can I rephrase this so that it does the searching through k_r and k_b FIRST and then go into eventactivity using the index on incidentid? It seems like that shouldn't be too hard to make fast but my SQL query skills are only average. To me, it looks like it is doing an index scan (on k_b.id) through k_b first, sorting the results by incidentid, then merge joining that with eventactivity. I'm guessing you actually want it to merge k_b and k_r to get extra selec
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 14, 2005, at 9:47 AM, Alvaro Herrera wrote: On Thu, Jul 14, 2005 at 12:28:05AM -0600, Dan Harris wrote: . Ext3 must really be crappy for postgres, or at least is on this box. Were you using the default journal settings for ext3? Yes, I was. Next time I get a chance to reboot this box, I will try writeback and compare the benchmarks to my previous config. Thanks for the tip. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] slow joining very large table to smaller ones
I'm trying to improve the speed of this query: explain select recordtext from eventactivity inner join ( select incidentid from k_r where id = 94 ) a using ( incidentid ) inner join ( select incidentid from k_b where id = 107 ) b using ( incidentid ); QUERY PLAN -- Merge Join (cost=2747.29..4249364.96 rows=11968693 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=1349.56..4230052.73 rows=4413563 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4051200.28 rows=44519781 width=49) -> Sort (cost=1349.56..1350.85 rows=517 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..1326.26 rows=517 width=68) Index Cond: (id = 107) -> Sort (cost=1397.73..1399.09 rows=542 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..1373.12 rows=542 width=68) Index Cond: (id = 94) (13 rows) There are many millions of rows in eventactivity. There are a few ten-thousand rows in k_r and k_b. There is an index on 'incidentid' in all three tables. There should only be less than 100 rows matched in k_r and k_b total. That part on its own is very very fast. But, it should have those 100 or so incidentids extracted in under a second and then go into eventactivity AFTER doing that. At least, that's my intention to make this fast. Right now, it looks like pg is trying to sort the entire eventactivity table for the merge join which is taking several minutes to do. Can I rephrase this so that it does the searching through k_r and k_b FIRST and then go into eventactivity using the index on incidentid? It seems like that shouldn't be too hard to make fast but my SQL query skills are only average. Thanks -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 14, 2005, at 12:12 AM, Greg Stark wrote: Dan Harris <[EMAIL PROTECTED]> writes: I keep the entire database vacuumed regularly. How often is "regularly"? Well, once every day, but there aren't a ton of inserts or updates going on a daily basis. Maybe 1,000 total inserts? Also, if you've done occasional massive batch updates like you describe here you may need a VACUUM FULL or alternatively a CLUSTER command to compact the table -- vacuum identifies the free space but if you've doubled the size of your table with a large update that's a lot more free space than you want hanging around waiting to be used. I have a feeling I'm going to need to do a cluster soon. I have done several mass deletes and reloads on it. For example, as I'm writing this, I am running an UPDATE statement that will affect a small part of the table, and is querying on an indexed boolean field. ... update eventactivity set ftindex = false where ftindex = true; ( added the where clause because I don't want to alter where ftindex is null ) It's definitely worthwhile doing an "EXPLAIN UPDATE..." to see if this even used the index. It sounds like it did a sequential scan. I tried that, and indeed it was using an index, although after reading Simon's post, I realize that was kind of dumb to have an index on a bool. I have since removed it. Sequential scans during updates are especially painful. If there isn't free space lying around in the page where the updated record lies then another page has to be used or a new page added. If you're doing a massive update you can exhaust the free space available making the update have to go back and forth between the page being read and the end of the table where pages are being written. This is great info, thanks. # vmstat output ( as I am waiting for this to finish ): procs ---memory-- ---swap-- -io --system-- cpu r b swpd freebuff cache si sobibo in cs us sy id wa 0 1 5436 2823908 26140 918370401 2211 540 694 336 9 2 76 13 [I assume you ran "vmstat 10" or some other interval and then waited for at least the second line? The first line outputted from vmstat is mostly meaningless] Yeah, this was at least 10 or so down the list ( the last one before ctrl-c ) Um. That's a pretty meager i/o rate. Just over 2MB/s. The cpu is 76% idle which sounds fine but that could be one processor pegged at 100% while the others are idle. If this query is the only one running on the system then it would behave just like that. Well, none of my processors had ever reached 100% until I changed to ext2 today ( read below for more info ) Is it possible you have some foreign keys referencing these records that you're updating? In which case every record being updated might be causing a full table scan on another table (or multiple other tables). If those tables are entirely in cache then it could cause these high cpu low i/o symptoms. No foreign keys or triggers. Ok, so I remounted this drive as ext2 shortly before sending my first email today. It wasn't enough time for me to notice the ABSOLUTELY HUGE difference in performance change. Ext3 must really be crappy for postgres, or at least is on this box. Now that it's ext2, this thing is flying like never before. My CPU utilization has skyrocketed, telling me that the disk IO was constraining it immensely. I always knew that it might be a little faster, but the box feels like it can "breathe" again and things that used to be IO intensive and run for an hour or more are now running in < 5 minutes. 
I'm a little worried about not having a journalized file system, but that performance difference will keep me from switching back ( at least to ext3! ). Maybe someday I will try XFS. I would be surprised if everyone who ran ext3 had this kind of problem, maybe it's specific to my kernel, raid controller, I don't know. But, this is amazing. It's like I have a new server. Thanks to everyone for their valuable input and a big thanks to all the dedicated pg developers on here who make this possible! -Dan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 13, 2005, at 2:17 PM, Stephen Frost wrote: Could you come up w/ a test case that others could reproduce where explain isn't returning? This was simply due to my n00bness :) I had always been doing explain analyze, instead of just explain. Next time one of these queries comes up, I will be sure to do the explain without analyze. FYI that update query I mentioned in the initial thread just finished after updating 8.3 million rows. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 13, 2005, at 1:11 PM, John A Meinel wrote: I might be wrong, but there may be something much more substantially wrong than slow i/o. John Yes, I'm afraid of that too. I just don't know what tools I should use to figure that out. I have some 20 other databases on this system, same schema but varying sizes, and the small ones perform very well. It feels like there is an O(n) increase in wait time that has recently become very noticeable on the largest of them. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Quad Opteron stuck in the mud
So sorry, I forgot to mention I'm running version 8.0.1 Thanks ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Quad Opteron stuck in the mud
Gurus, A table in one of my databases has just crossed the 30 million row mark and has begun to feel very sluggish for just about anything I do with it. I keep the entire database vacuumed regularly. And, as long as I'm not doing a sequential scan, things seem reasonably quick most of the time. I'm now thinking that my problem is IO because anything that involves heavy ( like a seq scan ) IO seems to slow to a crawl. Even if I am using indexed fields to grab a few thousand rows, then going to sequential scans it gets very very slow. I have also had the occurrence where queries will not finish for days ( I eventually have to kill them ). I was hoping to provide an explain analyze for them, but if they never finish... even the explain never finishes when I try that. For example, as I'm writing this, I am running an UPDATE statement that will affect a small part of the table, and is querying on an indexed boolean field. I have been waiting for over an hour and a half as I write this and it still hasn't finished. I'm thinking "I bet Tom, Simon or Josh wouldn't put up with this kind of wait time..", so I thought I would see if anyone here had some pointers. Maybe I have a really stupid setting in my conf file that is causing this. I really can't believe I am at the limits of this hardware, however. The query: update eventactivity set ftindex = false where ftindex = true; ( added the where clause because I don't want to alter where ftindex is null ) The table: Column|Type | Modifiers -+-+--- entrydate | timestamp without time zone | incidentid | character varying(40) | statustype | character varying(20) | unitid | character varying(20) | recordtext | character varying(255) | recordtext2 | character varying(255) | insertdate | timestamp without time zone | ftindex | boolean | Indexes: eventactivity1 btree (incidentid), eventactivity_entrydate_idx btree (entrydate), eventactivity_ftindex_idx btree (ftindex), eventactivity_oid_idx btree (oid) The hardware: 4 x 2.2GHz Opterons 12 GB of RAM 4x10k 73GB Ultra320 SCSI drives in RAID 0+1 1GB hardware cache memory on the RAID controller The OS: Fedora, kernel 2.6.6-1.435.2.3smp ( redhat stock kernel ) filesystem is mounted as ext2 # vmstat output ( as I am waiting for this to finish ): procs ---memory-- ---swap-- -io --system-- cpu r b swpd free buff cache si sobibo incs us sy id wa 0 1 5436 2823908 26140 918370401 2211 540 694 336 9 2 76 13 # iostat output ( as I am waiting for this to finish ): avg-cpu: %user %nice%sys %iowait %idle 9.190.002.19 13.08 75.53 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn cciss/c0d0 329.26 17686.03 4317.57 161788630 39496378 # This is a dedicated postgresql server, so maybe some of these settings are more liberal than they should be? relevant ( I hope ) postgresql.conf options are: shared_buffers = 5 effective_cache_size = 1348000 random_page_cost = 3 work_mem = 512000 max_fsm_pages = 8 log_min_duration_statement = 6 fsync = true ( not sure if I'm daring enough to run without this ) wal_buffers = 1000 checkpoint_segments = 64 checkpoint_timeout = 3000 # FOR PG_AUTOVACUUM --# stats_command_string = true stats_row_level = true Thanks in advance, Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
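On the ftindex flag: a full btree on a boolean is rarely useful, but a partial index over just the rows you actually hunt for can be. A sketch, assuming the rows still waiting for full-text processing (ftindex = false) are the minority being searched for; if it is the true side that is rare, flip the predicate:

-- small index containing only the not-yet-indexed rows
CREATE INDEX eventactivity_ftpending_idx ON eventactivity (entrydate)
    WHERE ftindex = false;

-- a scan like this can then use the partial index instead of walking 30M rows
SELECT incidentid, recordtext FROM eventactivity
WHERE ftindex = false ORDER BY entrydate;

Whether the planner prefers it still depends on how selective the flag is; if nearly every row matches, a sequential scan remains the right answer.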
[PERFORM] investigating slow queries through pg_stat_activity
I've got some queries generated by my application that will, for some reason, run forever until I kill the pid. Yet, when I run the queries manually to check them out, they usually work fine. To get more information about these queries, I'm writing a utility to take snapshots of pg_stat_activity every 5 minutes. If it finds a query that runs for longer than 15 minutes, it will trap the query so I can run 'explain analyze' on it and see where the weakness is. However, the problem I have is that pg_stat_activity only returns the first n (255?) characters of the SQL as "current_query", so it gets chopped off at the end. I would very much like to find out how I can get the *entire* query that is active. Is this possible? Also, I'm sure some people will respond with "turn on query logging".. I've explored that option and the formatting of the log file and the fact that EVERY query is logged is not what I'm after for this project. The "infinite-running" queries are unpredictable and may only happen once a week. Logging 24/7 in anticipation of one of these occurrences is not something I'd like to do. Thanks, Dan Harris ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
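For what it's worth, the truncation is a server-side limit on how much of the statement the stats collector keeps, not something the query can work around: in the 8.x era it was a compile-time constant, and from 8.4 on it is the track_activity_query_size setting. A sketch of the snapshot query itself, using the pre-9.2 column names (procpid, current_query):

-- anything that has been running for more than 15 minutes
SELECT procpid, usename, query_start,
       now() - query_start AS runtime,
       current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
  AND now() - query_start > interval '15 minutes'
ORDER BY query_start;

-- on 8.4+ the visible statement length is configurable in postgresql.conf:
-- track_activity_query_size = 4096   # bytes; the default is 1024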
Re: [PERFORM] Query tuning help
On May 8, 2005, at 8:32 PM, Russell Smith wrote: I have run this, and while it is very fast, I'm concerned it's not doing what I need. How fast is very fast? It took 35 seconds to complete versus ~450 my old way. select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00' AND em.entrydate <= '2005-5-9 00:00' AND (ea.recordtext like '%RED%' OR ea.recordtext like '%CORVETTE%')) JOIN eg ON em.incidentid = eg.incidentid WHERE em.incidentid IN (select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00' AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%CORVETTE%')) JOIN eg ON em.incidentid = eg.incidentid) AND em.incidentid IN (select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00' AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%RED%')) JOIN eg ON em.incidentid = eg.incidentid) Yes, it is nasty, but so was my previous query :) So long as this is faster, I'm ok with that. I'll see if i can make this work. Thank you very much. -Dan ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Query tuning help
On May 8, 2005, at 8:06 PM, Josh Berkus wrote: If I were to use tsearch2 for full-text indexing, would I need to create another table that merges all of my recordtext rows into a single 'text' field type? No. Read the OpenFTS docs, they are fairly clear on how to set up a simple FTS index. (TSearch2 ~~ OpenFTS) If so, this is where I run into problems, as my logic also needs to match multiple words in their original order. I have been reading the Tsearch2 docs and either I don't understand something or I'm not communicating my situation clearly enough. It seems that Tsearch2 has a concept of "document". And, in everything I am reading, they expect your "document" to be all contained in a single row. Since my words can be spread across multiple rows, I don't see that Tsearch2 will combine all 'recordtext' row values with the same "incidentid" into a single vector. Am I overlooking something in the docs? I'm doing something fairly similar on one of my projects and it works very well. I'd be curious what similarities they have? Is it the searching across multiple rows or the order of words? The limitations on TSearch2 indexes are: 1) they are expensive to update, so your data loads would be noticably slower. 2) they are only fast when cached in RAM (and when cached, are *very* fast). So if you have a variety of other processes that tend to fill up RAM between searches, you may find them less useful. 3) You have to create a materialized index column next to recordtext, which will increase the size of the table. Duly noted. If this method can search across rows, I'm willing to accept this overhead for the speed it would add. In the meantime, is there any way I can reach my goal without Tsearch2 by just restructuring my query to narrow down the results by date first, then seq scan for the 'likes'? -Dan ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
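On the "document spread across many rows" question, one pattern is a separate search table holding one concatenated document per incident, rebuilt whenever that incident's rows change. A sketch only, written against the built-in full-text search of modern PostgreSQL rather than the old tsearch2 contrib module (string_agg needs 9.0+; on 8.x the tsearch2 equivalents have different names):

-- one tsvector per incident, concatenating all of its recordtext rows
CREATE TABLE ea_search AS
SELECT incidentid,
       to_tsvector('english', string_agg(recordtext, ' ')) AS doc
FROM ea
GROUP BY incidentid;

CREATE INDEX ea_search_doc_idx ON ea_search USING gin (doc);

-- RED and CORVETTE anywhere within the same incident:
SELECT incidentid
FROM ea_search
WHERE doc @@ to_tsquery('english', 'red & corvette');

When a line for an incident is revised, that incident's single row in ea_search gets rebuilt, which sidesteps the in-line text replacement problem described above; matching words in their original order is the harder part and needs either phrase-search support or a fallback LIKE check over the few candidate incidents.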
Re: [PERFORM] Query tuning help
On May 8, 2005, at 6:51 PM, Russell Smith wrote: On Mon, 9 May 2005 09:20 am, Dan Harris wrote: You cannot use an index for %CORVETTE%, or %RED%. There is no way for the index to know if a row had that in the middle without scanning the whole index. So it's much cheaper to do a sequence scan. While I believe you, I'm confused by this line in my original EXPLAIN ANALYZE: -> Index Scan using ea1 on ea (cost=0.00..2736.43 rows=42 width=47) (actual time=2.085..2.309 rows=2 loops=473) Index Cond: ((ea.incidentid)::text = ("outer".incidentid)::text) Filter: (((recordtext)::text ~~ '%RED%'::text) OR ((recordtext)::text ~~ '%CORVETTE%'::text)) Doesn't that mean it was using an index to filter? Along those lines, before I created index 'ea1', the query was much much slower. So, it seemed like creating this index made a difference. One possible way to make the query faster is to limit based on date, as you will only get about 700 rows. And then don't use subselects, as they are doing full sequence scans. I think this query does what you do above, and I think it will be faster, but I don't know. I REALLY like this idea! If I could just filter by date first and then sequential scan through those, it should be very manageable. Hopefully I can keep this goal while still accommodating the requirement listed in my next paragraph. select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00' AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%RED%' AND ea.recordtext like '%CORVETTE%') JOIN eg ON em.incidentid = eg.incidentid WHERE (recordtext like '%RED%' or recordtext like '%CORVETTE%' ); I have run this, and while it is very fast, I'm concerned it's not doing what I need. Here's the situation: Due to the format of the systems with which I integrate ( I have no control over these formats ), we will get these 'recordtext' values one line at a time, accumulating over time. The only way I can find to make this work is to insert a new record for each line. The problem is, that when someone wants to search multiple keywords, they expect these words to be matched across multiple records with a given incident number. For a very simple example: IncidentID DateRecordtext -- - --- 1 2005-05-01 14:21 blah blah blah RED blah blah 2005-05-01 14:23 not what we are looking for 1 2005-05-02 02:05 blah CORVETTE blah blah So, doing a search with an 'and' condition, e.g. WHERE RECORDTEXT LIKE '%RED%' AND RECORDTEXT LIKE '%CORVETTE%' , will not match because the condition will only be applied to a single row of recordtext at a time, not a whole group with the same incident number. If I were to use tsearch2 for full-text indexing, would I need to create another table that merges all of my recordtext rows into a single 'text' field type? If so, this is where I run into problems, as my logic also needs to match multiple words in their original order. I may also receive additional updates to the previous data. In that case, I need to replace the original record with the latest version of it. If I have already concatenated these rows into a single field, the logic to in-line replace only the old text that has changed is very very difficult at best. So, that's the reason I had to do two subqueries in my example. Please tell me if I misunderstood your logic and it really will match given my condition above, but it didn't seem like it would. Thanks again for the quick responses! This list has been a great resource for me. 
-Dan ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
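For the cross-row AND matching described above, one way to restructure the query is to decide per incident whether each keyword appears in any of its rows, and only then pull the matching rows back out. A sketch using the shortened table names from this thread; bool_or() is available from 8.1 on, and the date filter is applied in both places so the grouping never touches rows outside the range:

SELECT em.incidentid, em.entrydate, ea.recordtext AS retdata, eg.long, eg.lat
FROM em
JOIN ea USING (incidentid)
JOIN eg USING (incidentid)
WHERE em.entrydate >= '2005-01-01 00:00'
  AND em.entrydate <= '2005-05-09 00:00'
  AND (ea.recordtext LIKE '%RED%' OR ea.recordtext LIKE '%CORVETTE%')
  AND em.incidentid IN (
        SELECT ea2.incidentid
        FROM ea ea2
        JOIN em em2 USING (incidentid)
        WHERE em2.entrydate >= '2005-01-01 00:00'
          AND em2.entrydate <= '2005-05-09 00:00'
        GROUP BY ea2.incidentid
        HAVING bool_or(ea2.recordtext LIKE '%RED%')
           AND bool_or(ea2.recordtext LIKE '%CORVETTE%'))
ORDER BY em.entrydate;

The HAVING clause is what lets RED and CORVETTE live on different rows of the same incident and still count as a combined match.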
[PERFORM] Query tuning help
Sorry to bother everyone with yet another "my query isn't using an index" problem but I am over my head on this one.. I am open to ways of restructuring this query to perform better. I have a table, 'ea', with 22 million rows in it. VACUUM ANALYZE has been just run on the table. This is the result of: explain analyze select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat from ea, em, eg where em.incidentid = ea.incidentid and em.incidentid = eg.incidentid and em.entrydate >= '2005-1-1 00:00' and em.entrydate <= '2005-5-9 00:00' and ea.incidentid in ( select incidentid from ea where recordtext like '%RED%' ) and ea.incidentid in ( select incidentid from ea where recordtext like '%CORVETTE%' ) and ( recordtext like '%RED%' or recordtext like '%CORVETTE%' ) order by em.entrydate - ANALYZE RESULTS - Unique (cost=774693.72..774693.76 rows=1 width=159) (actual time=446787.056..446787.342 rows=72 loops=1) -> Sort (cost=774693.72..774693.72 rows=1 width=159) (actual time=446787.053..446787.075 rows=72 loops=1) Sort Key: em.incidentid, public.ea.recordtext, eg.long, eg.lat -> Nested Loop (cost=771835.10..774693.71 rows=1 width=159) (actual time=444378.655..446786.746 rows=72 loops=1) -> Nested Loop (cost=771835.10..774688.81 rows=1 width=148) (actual time=444378.532..446768.381 rows=72 loops=1) -> Nested Loop IN Join (cost=771835.10..774678.88 rows=2 width=81) (actual time=444367.080..446191.864 rows=701 loops=1) -> Nested Loop (cost=771835.10..774572.05 rows=42 width=64) (actual time=444366.859..445463.232 rows=1011 loops=1) -> HashAggregate (cost=771835.10..771835.10 rows=1 width=17) (actual time=444366.702..444368.583 rows=473 loops=1) -> Seq Scan on ea (cost=0.00..771834.26 rows=335 width=17) (actual time=259.746..444358.837 rows=592 loops=1) Filter: ((recordtext)::text ~~ '%CORVETTE%'::text) -> Index Scan using ea1 on ea (cost=0.00..2736.43 rows=42 width=47) (actual time=2.085..2.309 rows=2 loops=473) Index Cond: ((ea.incidentid)::text = ("outer".incidentid)::text) Filter: (((recordtext)::text ~~ '%RED%'::text) OR ((recordtext)::text ~~ '%CORVETTE%'::text)) -> Index Scan using ea1 on ea (cost=0.00..2733.81 rows=42 width=17) (actual time=0.703..0.703 rows=1 loops=1011) Index Cond: (("outer".incidentid)::text = (ea.incidentid)::text) Filter: ((recordtext)::text ~~ '%RED%'::text) -> Index Scan using em_incidentid_idx on em (cost=0.00..4.95 rows=1 width=67) (actual time=0.820..0.821 rows=0 loops=701) Index Cond: (("outer".incidentid)::text = (em.incidentid)::text) Filter: ((entrydate >= '2005-01-01 00:00:00'::timestamp without time zone) AND (entrydate <= '2005-05-09 00:00:00'::timestamp without time zone)) -> Index Scan using eg_incidentid_idx on eg (cost=0.00..4.89 rows=1 width=79) (actual time=0.245..0.246 rows=1 loops=72) Index Cond: (("outer".incidentid)::text = (eg.incidentid)::text) Total runtime: 446871.880 ms (22 rows) - EXPLANATION - The reason for the redundant LIKE clause is that first, I only want those "incidentid"s that contain the words 'RED' and 'CORVETTE'. BUT, those two words may exist across multiple records with the same incidentid. Then, I only want to actually work with the rows that contain one of the words. This query will repeat the same logic for however many keywords are entered by the user. I have investigated text searching options and have not found them to be congruous with my application. Why is it choosing a sequential scan one part of the query when searching for the words, yet using an index scan for another part of it? 
Is there a better way to structure the query to give it better hints? I'm using 8.0.1 on a 4-way Opteron with beefy RAID-10 and 12GB of RAM. Thank you for any advice. -Dan ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Confusion about locales and 'like' indexes
Greetings, I have been beating myself up today trying to optimize indices for a query that uses LIKE. In my research I have read that the locale setting may affect PostgreSQL's choice of seq scan vs index scan. I am running Fedora Core 2 and it appears when I run "locale" that it is set to 'en_US.UTF-8'. Did I fall into a "gotcha" trap here about C vs non-C locales? I'm not much of a C programmer so I have no idea what all this touches, and everything has been left as default during PG compilation as well as the Fedora install. I can pg_dump and initdb again with --locale=C if this will allow my LIKE queries to use indexes, but I just wanted to know if there was some other place I needed to change locales in the system, e.g. postgresql.conf or env vars? Or would the initdb and reload alone fix it? I'm running 8.0.1 if that matters. Thanks ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
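As an aside on the LIKE point, re-initdb'ing with --locale=C is not the only way out: an operator-class index lets left-anchored LIKE patterns use an index even under a non-C locale. A sketch with a hypothetical table and column; it only helps 'prefix%' searches, never '%substring%' ones:

-- text_pattern_ops (or varchar_pattern_ops for varchar columns) compares byte-wise
CREATE INDEX mytable_name_like_idx ON mytable (name text_pattern_ops);

-- this can now use the index under en_US.UTF-8:
SELECT * FROM mytable WHERE name LIKE 'CORV%';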
[PERFORM] Hardware opinions wanted
I wanted to solicit some opinions on architecture and performance from you guys. I am torn right now between these two systems to replace my aging DB server: 4 x 2.2 GHz Opteron 8GB RAM Ultra320 15kRPM RAID5 with 128MB cache and 2-way 1.2GHz POWER4+ IBM pSeries 615 8GB RAM Ultra320 15kRPM RAID5 with 64MB cache I plan on serving ~80GB of pgsql database on this machine. The current machine handles around 1.5 million queries per day. I am having some trouble finding direct comparisons between the two architectures. The OS will most likely be Linux ( I'm hedging on AIX for now ). The pSeries has 8MB cache per CPU card ( 2 CPU on a card ) while the Opteron has 1MB for each processor. I know the POWER4+ is a very fast machine but I wonder if having more processors in the Opteron system would beat it for database serving? FWIW, they are very close in price. Ignoring the fault-tolerance features of the pSeries, which one would you pick for performance? Thanks, Dan ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] tuning for AIX 5L with large memory
Christopher Browne wrote: We have a couple of these at work; they're nice and fast, although the process of compiling things, well, "makes me feel a little unclean." Thanks very much for your detailed reply, Christopher. Would you mind elaborating on the "makes me feel a little unclean" statement? Also, I'm curious which models you are running and whether you have any anecdotal comparisons for performance? I'm completely unfamiliar with AIX, so if there are dark corners that await me, I'd love to hear a little more so I can be prepared. I'm going out on a limb here and jumping to an unfamiliar architecture as well as OS, but the I/O performance of these systems has convinced me that it's what I need to break out of my I/O-limited x86 systems. I suppose when I do get it, I'll just experiment with different sizes of shared memory and run some benchmarks. For the price of these things, they had better be some good marks! Thanks again -Dan Harris ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[PERFORM] tuning for AIX 5L with large memory
I will soon have at my disposal a new IBM pSeries server. The main mission for this box will be to serve several pg databases. I have ordered 8GB of RAM and want to learn the best way to tune pg and AIX for this configuration. Specifically, I am curious about shared memory limitations. I've had to tune the shmmax on linux machines before but I'm new to AIX and not sure if this is even required on that platform? Google has not been much help for specifics here. Hoping someone else here has a similar platform and can offer some advice.. Thanks! -Dan Harris ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly