[PERFORM] Contemplating SSD Hardware RAID
I'm looking for advice from the I/O gurus who have been in the SSD game for a while now. I understand that the majority of consumer grade SSD drives lack the required capacitor to complete a write on a sudden power loss. But, what about pairing up with a hardware controller with BBU write cache? Can the write cache be disabled at the drive and result in a safe setup? I'm exploring the combination of an Areca 1880ix-12 controller with 6x OCZ Vertex 3 V3LT-25SAT3 2.5" 240GB SATA III drives in RAID-10. Has anyone tried this combination? What nasty surprise am I overlooking here? Thanks -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
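For anyone testing this sort of setup, a minimal sketch of how the on-drive write cache can be inspected and turned off under Linux; the device name is only a placeholder, and drives sitting behind a RAID controller are usually not visible as /dev/sdX at all, in which case the equivalent toggle typically lives in the controller's own BIOS or CLI (Areca controllers expose a disk write-cache setting in their management interface).

    # report whether the drive's volatile write cache is on (assumes a directly attached SATA disk)
    hdparm -W /dev/sda

    # turn the on-drive write cache off, leaving only the controller's BBU-protected cache volatile
    hdparm -W 0 /dev/sda

With the drive cache disabled, a sudden power loss can only lose data sitting in the battery-backed controller cache, which is exactly what the BBU is there to cover.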
Re: [PERFORM] Linux I/O schedulers - CFQ & random seeks
On 3/4/11 11:03 AM, Wayne Conrad wrote: On 03/04/11 10:34, Glyn Astill wrote: > I'm wondering (and this may be a can of worms) what peoples opinions are on these schedulers? When testing our new DB box just last month, we saw a big improvement in bonnie++ random I/O rates when using the noop scheduler instead of cfq (or any other). We've got RAID 10/12 on a 3ware card w/ battery-backed cache; 7200rpm drives. Our file system is XFS with noatime,nobarrier,logbufs=8,logbsize=256k. How much is "big?" I can't find my notes for it, but I recall that the difference was large enough to surprise us. We're running with noop in production right now. No complaints. Just another anecdote, I found that the deadline scheduler performed the best for me. I don't have the benchmarks anymore but deadline vs cfq was dramatically faster for my tests. I posted this to the list years ago and others announced similar experiences. Noop was a close 2nd to deadline. XFS (noatime,nodiratime,nobarrier,logbufs=8) 391GB db cluster directory BBU Caching RAID10 12-disk SAS 128GB RAM Constant insert stream OLAP-ish query patterns Heavy random I/O -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
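For anyone who wants to repeat these comparisons, a quick sketch of how the scheduler is inspected and switched; the device name is just an example:

    # show the available schedulers for a device; the active one is in brackets
    cat /sys/block/sdb/queue/scheduler

    # switch that device to deadline (or noop) at runtime
    echo deadline > /sys/block/sdb/queue/scheduler

    # or set it globally at boot time on the kernel command line:
    #   elevator=deadline

The runtime switch makes it easy to benchmark cfq, deadline and noop back to back on the same box before committing to a boot-time setting.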
Re: [PERFORM] Slow count(*) again...
On 10/12/10 4:33 PM, Neil Whelchel wrote: On Tuesday 12 October 2010 08:39:19 Dan Harris wrote: On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file fragmentation over time and sequential scans end up mostly random. On-line file defrag helps tremendously. We just had a corrupt table caused by an XFS online defrag. I'm scared to use this again while the db is live. Has anyone else found this to be safe? But, I can vouch for the fragmentation issue, it happens very quickly in our system. -Dan I would like to know the details of what was going on that caused your problem. I have been using XFS for over 9 years, and it has never caused any trouble at all in a production environment. Sure, I had many problems with it on the test bench, but in most cases the issues were very clear and easy to avoid in production. There were some (older) XFS tools that caused some problems, but that is in the past, and as time goes on, it seems to take less and less planning to make it work properly. -Neil- There were roughly 50 transactions/sec going on at the time I ran it. xfs_db reported 99% fragmentation before it ran ( we haven't been running it via cron ). The operation completed in about 15 minutes ( 360GB of used data on the file system ) with no errors. Everything seemed fine until the next morning when a user went to query a table, and we got a message about a "missing" file inside the pg cluster. We were unable to query the table at all via psql. It was a bit of a panic situation, so we restored that table from backup immediately and the problem was solved without doing more research. This database has been running for years with no problem ( and none since ); that was the first time I tried to do an on-line defrag, and that was the only unusual variable introduced into the system at the time, so it was a strong enough correlation for me to believe it caused the corruption. Hopefully this was just a corner case. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
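For reference, this is roughly how the fragmentation report and the online defragmenter are run; the device and mount point here are placeholders:

    # read-only fragmentation report for an XFS filesystem
    xfs_db -c frag -r /dev/sdb1

    # online defragment a mounted XFS filesystem, verbosely
    xfs_fsr -v /var/lib/pgsql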
Re: [PERFORM] Slow count(*) again...
On 10/12/10 10:44 AM, Scott Carey wrote: On Oct 12, 2010, at 8:39 AM, Dan Harris wrote: On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file fragmentation over time and sequential scans end up mostly random. On-line file defrag helps tremendously. We just had a corrupt table caused by an XFS online defrag. I'm scared to use this again while the db is live. Has anyone else found this to be safe? But, I can vouch for the fragmentation issue, it happens very quickly in our system. What version? I'm using the latest CentOS extras build. We've been doing online defrag for a while now on a very busy database with > 8TB of data. Not that that means there are no bugs... It is a relatively simple thing in xfs -- it writes a new file to temp in a way that allocates contiguous space if available, then, if the file has not been modified since it was re-written, it is essentially moved on top of the other one. This should be safe provided the journaling and storage are safe, etc. I'm not sure how to figure out what version of XFS we're on, but it's Linux kernel 2.6.24-24 x86_64 on Ubuntu Server 8.04.3, Postgres version 8.3. We're due for an upgrade on that server soon so we'll do some more testing once we upgrade. Right now we are just living with the fragmentation. I'm glad to hear the regular on-line defrag is working successfully; at least that gives me hope we can rely on it in the future. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Slow count(*) again...
On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file fragmentation over time and sequential scans end up mostly random. On-line file defrag helps tremendously. We just had a corrupt table caused by an XFS online defrag. I'm scared to use this again while the db is live. Has anyone else found this to be safe? But, I can vouch for the fragmentation issue, it happens very quickly in our system. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] large dataset with write vs read clients
On 10/7/10 11:47 AM, Aaron Turner wrote: Basically, each connection is taking about 100MB resident. As we need to increase the number of threads to be able to query all the devices in the 5 minute window, we're running out of memory. I think the first thing to do is look into using a connection pooler like pgpool to reduce your connection memory overhead. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
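As a rough sketch of the pooling idea (the values here are only illustrative, not recommendations), a pgpool-II setup lets the collector keep many client connections while the database itself only sees a small, fixed set of backends:

    # pgpool.conf fragment
    num_init_children = 32        # client connections pgpool will accept
    max_pool = 4                  # cached backend connections per child
    backend_hostname0 = 'dbhost'  # placeholder host
    backend_port0 = 5432

Each PostgreSQL backend keeps its ~100MB footprint, but there are far fewer of them than there are querying threads.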
Re: [PERFORM] Got that new server, now it's time for config!
On 3/22/10 4:36 PM, Carlo Stonebanks wrote: Here we go again! Can anyone see any obvious faults? Carlo maintenance_work_mem = 256MB I'm not sure how large your individual tables are, but you might want to bump this value up to get faster vacuums. max_fsm_relations = 1000 I think this will definitely need to be increased work_mem = 64MB Most data warehousing loads I can think of will need more work_mem, but this depends on how large of data sets you are planning to sort. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
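As a sketch only -- the right numbers depend on table sizes and on how many sessions sort at once -- those suggestions translate into postgresql.conf entries along these lines:

    maintenance_work_mem = 1GB   # speeds up VACUUM and index builds on big tables
    max_fsm_relations = 10000    # should exceed the total number of tables and indexes
    work_mem = 256MB             # per sort/hash, so budget for concurrent queries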
[PERFORM] The state of PG replication in 2008/Q2?
My company finally has the means to install a new database server for replication. I have Googled and found a lot of sparse information out there regarding replication systems for PostgreSQL and a lot of it looks very out-of-date. Can I please get some ideas from those of you that are currently using fail-over replication systems? What advantage does your solution have? What are the "gotchas" I need to worry about? My desire would be to have a parallel server that could act as a hot standby system with automatic fail over in a multi-master role. If our primary server goes down for whatever reason, the secondary would take over and handle the load seamlessly. I think this is really the "holy grail" scenario and I understand how difficult it is to achieve. Especially since we make frequent use of sequences in our databases. If MM is too difficult, I'm willing to accept a hot-standby read-only system that will handle queries until we can fix whatever ails the master. We are primary an OLAP environment but there is a constant stream of inserts into the databases. There are 47 different databases hosted on the primary server and this number will continue to scale up to whatever the server seems to support. The reason I mention this number is that it seems that those systems that make heavy use of schema changes require a lot of "fiddling". For a single database, this doesn't seem too problematic, but any manual work involved and administrative overhead will scale at the same rate as the database count grows and I certainly want to minimize as much fiddling as possible. We are using 8.3 and the total combined size for the PG data directory is 226G. Hopefully I didn't neglect to include more relevant information. As always, thank you for your insight. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] query performance question
Kenneth Marshall wrote: Dan, Did you try this with 8.3 and its new HOT functionality? Ken I did not. I had to come up with the solution before we were able to move to 8.3. But, Tom did mention that the HOT might help and I forgot about that when writing the prior message. I'm in the midst of moving 30 databases from 8.0 to 8.3 at the moment but when I'm finished, I might have time to test it. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] query performance question
[EMAIL PROTECTED] wrote: 3) Build a table with totals or maybe subtotals, updated by triggers. This requires serious changes in application as well as in database, but solves issues of 1) and may give you even better results. Tomas I have tried this. It's not a magic bullet. We do our billing based on counts from huge tables, so accuracy is important to us. I tried implementing such a scheme and ended up abandoning it because the summary table became so full of dead tuples during and after large bulk inserts that it slowed down selects on that table to an unacceptable speed. Even with a VACUUM issued every few hundred inserts, it still bogged down due to the constant churn of the inserts. I ended up moving this count tracking into the application level. It's messy and only allows a single instance of an insert program due to the localization of the counts in program memory, but it was the only way I found to avoid the penalty of constant table churn on the triggered inserts. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planning a new server - help needed
Laszlo Nagy wrote: Question 4. How to make the partitions? This is the hardest question. Here is my plan: - the OS resides on 2 disks, RAID 1 - the databases should go on 8 disks, RAID 0 + 1 Make sure you understand the difference between RAID 1+0 and RAID 0+1.. I suspect you'll end up going with 1+0 instead. -Dan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Minimizing dead tuples caused by update triggers
I've been fighting with the common workarounds for inadequate response times on select count(*) and min(),max() on tables with tens of millions of rows for quite a while now and understand the reasons for the table scans. I have applications that regularly poll a table ( ideally, the more frequent, the better ) to learn the most recent data inside it as well as the # of rows in it ( among a few other things ). As the databases have grown in size, these summarizations could no longer be done on the fly, so I wrote a database wrapper API that tracks those values internally. This wrapper has grown very complex and is difficult to manage across different systems. What I'd like to do instead is implement triggers for inserts, updates, and deletes to check and/or replace a value in a "table_stats" table, representing table count, min/max dates, and a few other costly operations, that can then be queried in short order. I know this is a fairly common thing to do. The thing that concerns me is dead tuples on the table_stats table. I believe that every insert of new data in one of the monitored tables will result in an UPDATE of the table_stats table. When thousands ( or millions ) of rows are inserted, the select performance ( even trying with an index ) on table_stats slows down in a hurry. If I wrap the inserts into large transactions, will it only call the update on table_stats when I commit? Obviously I want to vacuum this table regularly to recover this space. The problem I'm running into is contention between VACUUM ( not full ) and pg_dump ( version 8.0.12 ). My system backup takes 6 hours to run pg_dump on a 400GB cluster directory. If the vacuum command fires during the dump, it forces an exclusive lock and any queries will hang until pg_dump finishes. If I have to wait until pg_dump is finished before issuing the VACUUM command, everything slows down significantly as the dead tuples in table_stats pile up. What strategy could I employ to either: 1. resolve the contention between pg_dump and vacuum, or 2. reduce the dead tuple pile-up between vacuums? Thanks for reading -Dan ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
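A minimal sketch of the trigger approach described above, borrowing the eventmain/entrydate names from elsewhere in this thread purely for illustration:

    -- one row per monitored table
    CREATE TABLE table_stats (
        tablename     text PRIMARY KEY,
        row_count     bigint NOT NULL,
        max_entrydate timestamp without time zone
    );

    CREATE OR REPLACE FUNCTION eventmain_stats_trig() RETURNS trigger AS $$
    BEGIN
        UPDATE table_stats
           SET row_count = row_count + 1,
               max_entrydate = CASE WHEN max_entrydate IS NULL OR NEW.entrydate > max_entrydate
                                    THEN NEW.entrydate ELSE max_entrydate END
         WHERE tablename = 'eventmain';
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER eventmain_stats
        AFTER INSERT ON eventmain
        FOR EACH ROW EXECUTE PROCEDURE eventmain_stats_trig();

Note that a row-level trigger like this fires for every inserted row as the statement runs, not once at commit, so wrapping the bulk inserts in one transaction does not reduce the number of UPDATEs on table_stats -- each of which leaves a dead row version behind.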
[PERFORM] pg_dump blocking create database?
My PG server came to a screeching halt yesterday. Looking at top saw a very large number of "startup waiting" tasks. A pg_dump was running and one of my scripts had issued a CREATE DATABASE command. It looks like the CREATE DATABASE was exclusive but was having to wait for the pg_dump to finish, causing a massive traffic jam of locks behind it. Once I killed the pg_dump process, things returned to normal. Version 8.0.12. Is this a bug? It also concerns me because my workload is likely to expose this problem again. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Performance problem with table containing a lot of text (blog)
Kari Lavikka wrote: Hello! Some background info.. We have a blog table that contains about eight million blog entries. Average length of an entry is 1200 letters. Because each 8k page can accommodate only a few entries, every query that involves several entries causes several random seeks to disk. We are having problems with queries like: 1) give me a list of months when I have written someting 2) give me id's of entries I have written on month X year X 3) give me the number of blog entries my friends have written since last time I didn't see your schema, but couldn't these problems be solved by storing the article id, owner id, and blog date in a separate table? It seems that if you don't actually need the content of the blogs, all of those questions could be answered by querying a very simple table with minimal I/O overhead. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
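A sketch of that idea, with invented names: keep a narrow companion table holding only the columns the listing queries touch, so each 8k page packs hundreds of rows instead of a handful of 1200-character entries:

    CREATE TABLE blog_entry_meta (
        entry_id   integer PRIMARY KEY,   -- references the blog table
        owner_id   integer NOT NULL,
        entry_date date    NOT NULL
    );
    CREATE INDEX blog_entry_meta_owner_date ON blog_entry_meta (owner_id, entry_date);

    -- e.g. "months when I have written something"
    SELECT DISTINCT date_trunc('month', entry_date)
      FROM blog_entry_meta
     WHERE owner_id = 42;

Questions 1-3 can all be answered from this table without ever touching the wide rows.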
Re: [PERFORM] Simple query showing 270 hours of CPU time
Tom Lane wrote: Dan Harris <[EMAIL PROTECTED]> writes: Here's the strace summary as run for a few second sample:

 % time     seconds  usecs/call     calls    errors syscall
 ------ ----------- ----------- --------- --------- ----------------
  97.25    0.671629          92      7272           semop
   1.76    0.012171         406        30           recvfrom
   0.57    0.003960          66        60           gettimeofday
   0.36    0.002512          28        90           sendto
   0.05    0.000317          10        32           lseek
   0.01    0.000049           1        48           select
 ------ ----------- ----------- --------- --------- ----------------
 100.00    0.690638                    7532           total

Here's the query: select id from eventkeywords where word = '3322' How sure are you that (a) that's really what it's doing and (b) you are not observing multiple executions of the query? There are no recvfrom calls in the inner loops of the backend AFAIR, so this looks to me like the execution of 30 different queries. The number of semops is distressingly high, but that's a contention issue not an amount-of-runtime issue. You were absolutely right. This is one connection that is doing a whole lot of ( slow ) queries. I jumped the gun on this and assumed it was a single query taking this long. Sorry to waste time and bandwidth. Since you mentioned the number of semops is distressingly high, does this indicate a tuning problem? The machine has 64GB of RAM and as far as I can tell about 63GB is all cache. I wonder if this is a clue to an undervalued memory-related setting somewhere? -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Simple query showing 270 hours of CPU time
Today, I looked at 'top' on my PG server and saw a pid that reported 270 hours of CPU time. Considering this is a very simple query, I was surprised to say the least. I was about to just kill the pid, but I figured I'd try and see exactly what it was stuck doing for so long. Here's the strace summary as run for a few-second sample:

 % time     seconds  usecs/call     calls    errors syscall
 ------ ----------- ----------- --------- --------- ----------------
  97.25    0.671629          92      7272           semop
   1.76    0.012171         406        30           recvfrom
   0.57    0.003960          66        60           gettimeofday
   0.36    0.002512          28        90           sendto
   0.05    0.000317          10        32           lseek
   0.01    0.000049           1        48           select
 ------ ----------- ----------- --------- --------- ----------------
 100.00    0.690638                    7532           total

Here's the query: select id from eventkeywords where word = '3322' If I run the query manually, it completes in about 500ms, which is very reasonable. There are 408563 rows in this table. I just noticed there is no index on word ( there should be! ). Would this have caused the problem? This is 8.0.12 on Linux sunrise 2.6.15-26-amd64-server #1 SMP Fri Sep 8 20:33:15 UTC 2006 x86_64 GNU/Linux. Any idea what might have set it into this loop? -Dan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
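For completeness, a sketch of the missing index (followed by a fresh ANALYZE so the planner knows about it):

    CREATE INDEX eventkeywords_word_idx ON eventkeywords (word);
    ANALYZE eventkeywords;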
Re: [PERFORM] importance of fast disks with pg
Thomas Finneid wrote: Hi During some tests I did, I noticed that it does not necessarily seem to be true that one needs the fastest disks to have a pg system that is fast. It seems to me that it's more important to: - choose the correct methods to use for the operation - tune the pg memory settings - tune/disable pg xlog/wal etc It also seems to me that fast disks are more important for db systems of the OLTP type applications with real concurrency of both readers and writers across many, possibly larger, tables etc. Are the above statements close to having any truth in them? regards thomas I'd say that "it depends". We run an OLAP workload on 350+ gigs of database on a system with 64GB of RAM. I can tell you for certain that fetching non-cached data is very sensitive to disk throughput! Different types of workloads will find different bottlenecks in the system. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] [ADMIN] reclaiming disk space after major updates
Andrew Sullivan wrote: On Thu, Jun 07, 2007 at 03:26:56PM -0600, Dan Harris wrote: They don't always have to be in a single transaction, that's a good idea to break it up and vacuum in between, I'll consider that. Thanks If you can do it this way, it helps _a lot_. I've had to do this sort of thing, and breaking into groups of a couple thousand or so really made the difference. A One more point in my original post.. For my own education, why does VACUUM FULL prevent reads to a table when running (I'm sure there's a good reason)? I can certainly understand blocking writes, but if I could still read from it, I'd have no problems at all! -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Seq Scan
Tyler Durden wrote: Hi, I'm having some problems in performance in a simple select count(id) from I have 700 000 records in one table, and when I do: # explain select (id) from table_name; -[ RECORD 1 ] QUERY PLAN | Seq Scan on table_name (cost=0.00..8601.30 rows=266730 width=4) I had created an index for id(btree), but still shows "Seq Scan". What I'm doing wrong? Thanks, Tyler ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org select count(*) will *always* do a sequential scan, due to the MVCC architecture. See archives for much discussion about this. -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
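When an exact figure isn't required, one common workaround is to read the planner's own row estimate from pg_class, which costs almost nothing; it is only as fresh as the last VACUUM/ANALYZE:

    SELECT reltuples::bigint AS approx_rows
      FROM pg_class
     WHERE relname = 'table_name';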
Re: [PERFORM] Background vacuum
Daniel Haensse wrote: Dear list, I'm running postgres on a tomcat server. The vacuum is run every hour (cronjob) which leads to a performance drop of the tomcat applications. I played around with renice command and I think it is possible to reduce this effect which a renice. The problem is how can I figure out the PID of the postmaster performing the vacuum(automated)? Has anybody a nice solution to change process priority? A shell script, maybe even for java? While this may technically work, I think it lacks a key point. 'nice' ( at least the versions I'm familiar with ) do not adjust I/O priority. VACUUM is bogging things down because of the extra strain on I/O. CPU usage shouldn't really be much of a factor. Instead, I would recommend looking at vacuum_cost_delay and the related settings to make vacuum lower priority than the queries you care about. This should be a cleaner solution for you. -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
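The cost-based delay settings look roughly like this in postgresql.conf; the numbers are only a starting point and need tuning against the real I/O load:

    vacuum_cost_delay = 20     # ms to sleep each time the cost limit is reached (0 = no throttling)
    vacuum_cost_limit = 200    # accumulated page cost that triggers the sleep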
Re: [PERFORM]
Orhan Aglagul wrote: Hi Everybody, I was trying to see how many inserts per seconds my application could handle on various machines. I read that postgres does have issues with MP Xeon (costly context switching). But I still think that with fsync=on 65 seconds is ridiculous. CPU is unlikely your bottleneck.. You failed to mention anything about your I/O setup. More details in this regard will net you better responses. However, an archive search for insert performance will probably be worthwhile, since this type of question is repeated about once a month. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
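One very common cause of numbers like that is committing every insert individually; with fsync=on each commit has to wait for a WAL flush. A quick sanity check, sketched here with a made-up table, is to batch the inserts into one transaction or load with COPY:

    BEGIN;
    INSERT INTO test_table VALUES (1, 'row one');
    -- ... thousands more inserts ...
    COMMIT;   -- one synchronous WAL flush instead of one per row

    -- or bulk-load from a file, which is faster still
    COPY test_table FROM '/tmp/test_data.csv' WITH CSV;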
Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning
Bill Moran wrote: In response to Dan Harris <[EMAIL PROTECTED]>: Why does the user need to manually track max_fsm_pages and max_fsm_relations? I bet there are many users who have never taken the time to understand what this means and wondering why performance still stinks after vacuuming their database ( spoken from my own experience ) But there are two distinct routes that can be taken if there's not enough fsm space: add fsm space or vacuum more frequently. I don't want the system to eat up a bunch of memory for fsm entries if my workload indicates that I can easily vacuum more frequently. There's no magic bullet here, but heuristics should be able to tell us you can "easily vacuum more frequently" And again, I said these things would be *optional*. Like an item in postgresql.conf "i_have_read_the_manual_and_know_what_this_all_means = false #default false". If you change it to true, you have all the control you're used to and nothing will get in your way. How about work_mem? shared_buffers? column statistics sizes? random_page_cost? The only one that seems practical (to me) is random_page_cost. The others are all configuration options that I (as a DBA) want to be able to decide for myself. For example, I have some dedicated PG servers that I pretty much max those values out at, to let PG know that it can use everything on the system -- but I also have some shared use machines with PG, where I carefully constrain those values so that PG doesn't muscle other daemons out of their share of the RAM (work_mem is probably the best example) Just because you carefully constrain it does not preclude the ability for program logic to maintain statistics to do what I suggested. It would be nice to have some kind of utility that could tell me what random_page_cost should be, as I've never felt comfortable tweaking it. Like some utility to run that would say "based on the seek tests I just ran, you should set random_page_cost to x". Of course, if such a thing existed, it could just fill in the value for you. But I haven't figured out how to pick a good value for that setting, so I have no idea how to suggest to have it automatically set. Me either, but I thought if there's a reason it's user-settable, there must be some demonstrable method for deciding what is best. Couldn't some fairly simple regression tests akin to a VACUUM process spot potential problems? "Hey, it looks like you need more fsm_relations.. I bumped that up automatically for you". Or "These indexes look bloated, shall I automatically reindex them for you?" A lot of that stuff does happen. A vacuum verbose will tell you what it thinks you should do, but I don't _want_ it to do it automatically. What if I create huge temporary tables once a week for some sort of analysis that overload the fsm space? And if I'm dropping those tables when the analysis is done, do I want the fsm space constantly adjusting? I understand *you* don't want it done automatically. But my suspicion is that there are a lot more newbie pg admins who would rather let the system do something sensible as a default. Again, you sound defensive that somehow my ideas would take power away from you. I'm not sure why that is, but certainly I'm not suggesting that. An auto-pilot mode is not a bad idea just because a few pilots don't want to use it. Plus, some is just impossible. shared_buffers requires a restart. Do you want your DB server spontaneously restarting because it thought more buffers might be nice? 
Well, maybe look at the bigger picture and see if it can be fixed to *not* require a program restart? Or.. take effect on the next pid that gets created? This is a current limitation, but doesn't need to be one for eternity does it? I'm sure there are many more examples, that with some creative thinking, could be auto-adjusted to match the usage patterns of the database. PG does an excellent job of exposing the variables to the users, but mostly avoids telling the user what to do or doing it for them. Instead, it is up to the user to know where to look, what to look for, and how to react to things to improve performance. This is not all bad, but it is assuming that all users are hackers ( which used to be true ), but certainly doesn't help when the average SQLServer admin tries out Postgres and then is surprised at the things they are now responsible for managing. PG is certainly *not* the only database to suffer from this syndrome, I know.. I expect the suffering is a result of the fact that databases are non-trivial pieces of software, and there's no universally simple way to set them up and make them run well. Speaking as a former SQL Server admin ( from day 1 of the Sybase fork up to version 2000 ), I can say there *is* a way to
Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning
Michael Stone wrote: On Fri, Apr 27, 2007 at 09:27:49AM -0400, Carlos Moreno wrote: Notice that the second part of my suggestion covers this --- have additional switches to initdb If the person knows all that, why wouldn't they know to just change the config parameters? Exactly.. What I think would be much more productive is to use the great amount of information that PG tracks internally and auto-tune the parameters based on it. For instance: Why does the user need to manually track max_fsm_pages and max_fsm_relations? I bet there are many users who have never taken the time to understand what this means and wondering why performance still stinks after vacuuming their database ( spoken from my own experience ) How about work_mem? shared_buffers? column statistics sizes? random_page_cost? Couldn't some fairly simple regression tests akin to a VACUUM process spot potential problems? "Hey, it looks like you need more fsm_relations.. I bumped that up automatically for you". Or "These indexes look bloated, shall I automatically reindex them for you?" I'm sure there are many more examples, that with some creative thinking, could be auto-adjusted to match the usage patterns of the database. PG does an excellent job of exposing the variables to the users, but mostly avoids telling the user what to do or doing it for them. Instead, it is up to the user to know where to look, what to look for, and how to react to things to improve performance. This is not all bad, but it is assuming that all users are hackers ( which used to be true ), but certainly doesn't help when the average SQLServer admin tries out Postgres and then is surprised at the things they are now responsible for managing. PG is certainly *not* the only database to suffer from this syndrome, I know.. I like to think of my systems as good employees. I don't want to have to micromanage everything they do. I want to tell them "here's what I want done", and assuming I made a good hiring choice, they will do it and take some liberty to adjust parameters where needed to achieve the spirit of the goal, rather than blindly do something inefficiently because I failed to explain to them the absolute most efficient way to accomplish the task. Granted, there are some people who don't like the developers making any assumptions about their workload. But this doesn't have to be an either/or proposition. I don't think any control needs to be abandoned. But self-adjusting defaults seem like an achievable goal ( I know, I know, "show us the patch" ). I just don't know if this feeling has resonated well between new users and long-term developers. I know it must be grating to have to answer the same questions over and over and over "have you analyzed? Did you leave postgresql.conf at the defaults??". Seems like a win-win for both sides, IMHO. In closing, I am not bashing PG! I love it and swear by it. These comments are purely from an advocacy perspective. I'd love to see PG user base continue to grow. My .02 -Dan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Finding bloated indexes?
Is there a pg_stat_* table or the like that will show how bloated an index is? I am trying to squeeze some disk space and want to track down where the worst offenders are before performing a global REINDEX on all tables, as the database is rougly 400GB on disk and this takes a very long time to run. I have been able to do this with tables, using a helpful view posted to this list a few months back, but I'm not sure if I can get the same results on indexes. Thanks -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
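As far as I know there is no single pg_stat_* column that reports bloat directly, but ranking indexes by on-disk size is a cheap first cut before deciding what to REINDEX; this sketch uses pg_relation_size(), which is built in from 8.1 on (older releases have similar functions in contrib/dbsize):

    SELECT schemaname, relname, indexrelname,
           pg_relation_size(indexrelid) AS index_bytes
      FROM pg_stat_user_indexes
     ORDER BY index_bytes DESC
     LIMIT 20;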
[PERFORM] Planner doing seqscan before indexed join
8.0.3 - Linux 2.6.18.. Freshly vacuumed and analyzed This database has been humming along fine for a while now, but I've got one of those sticky queries that is taking much too long to finish. After some digging, I've found that the planner is choosing to apply a necessary seq scan to the table. Unfortunately, it's scanning the whole table, when it seems that it could have joined it to a smaller table first and reduce the amount of rows it would have to scan dramatically ( 70 million to about 5,000 ). The table "eventactivity" has about 70million rows in it, index on "incidentid" The table "keyword_incidents" is a temporary table and has incidentid as its primary key. It contains 5125 rows. eventmain and eventgeo both have 2.1 million. My hope is that I can convince the planner to do the join to keyword_incidents *first* and then do the seq scan for the LIKE condition. Instead, it seems that it's seqscanning the whole 70 million rows first and then doing the join, which takes a lot longer than I'd like to wait for it. Or, maybe I'm misreading the explain output? Thanks again -Dan - Here's the query: explain analyze select * from keyword_incidents, eventactivity, eventmain, eventgeo where eventmain.incidentid = keyword_incidents.incidentid and eventgeo.incidentid = keyword_incidents.incidentid and eventactivity.incidentid = keyword_incidents.incidentid and ( recordtext like '%JOSE CHAVEZ%' ) order by eventmain.entrydate limit 1; --- Limit (cost=2388918.07..2388918.08 rows=1 width=455) (actual time=81771.186..81771.292 rows=26 loops=1) -> Sort (cost=2388918.07..2388918.08 rows=1 width=455) (actual time=81771.180..81771.215 rows=26 loops=1) Sort Key: eventmain.entrydate -> Nested Loop (cost=0.00..2388918.06 rows=1 width=455) (actual time=357.389..81770.982 rows=26 loops=1) -> Nested Loop (cost=0.00..2388913.27 rows=1 width=230) (actual time=357.292..81767.385 rows=26 loops=1) -> Nested Loop (cost=0.00..2388909.33 rows=1 width=122) (actual time=357.226..81764.501 rows=26 loops=1) -> Seq Scan on eventactivity (cost=0.00..2388874.46 rows=7 width=84) (actual time=357.147..81762.582 rows=27 loops=1) Filter: ((recordtext)::text ~~ '%JOSE CHAVEZ%'::text) -> Index Scan using keyword_incidentid_pkey on keyword_incidents (cost=0.00..4.97 rows=1 width=38) (actual time=0.034..0.036 rows=1 loops=27) Index Cond: (("outer".incidentid)::text = (keyword_incidents.incidentid)::text) -> Index Scan using eventgeo_incidentid_idx on eventgeo (cost=0.00..3.93 rows=1 width=108) (actual time=0.076..0.081 rows=1 loops=26) Index Cond: (("outer".incidentid)::text = (eventgeo.incidentid)::text) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..4.78 rows=1 width=225) (actual time=0.069..0.075 rows=1 loops=26) Index Cond: (("outer".incidentid)::text = (eventmain.incidentid)::text) Total runtime: 81771.529 ms (15 rows) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Determining server load from client
Dan Harris wrote: I've found that it would be helpful to be able to tell how busy my dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before pounding it with some OLAP-type queries. ..snip Thank you all for your great ideas! I'm going to try the perl function as that seems like a very elegant way of doing it. -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Determining server load from client
I've found that it would be helpful to be able to tell how busy my dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before pounding it with some OLAP-type queries. Specifically, I have a multi-threaded client program that needs to run several thousand sequential queries. I broke it into threads to take advantage of the multi-core architecture of the server hardware. It would be very nice if I could check the load of the server at certain intervals to throttle the number of concurrent queries and mitigate load problems when other processes might be already inducing a significant load. I have seen some other nice back-end things exposed through PG functions ( e.g. database size on disk ) and wondered if there was anything applicable to this. Even if it can't return the load average proper, is there anything else in the pg_* tables that might give me a clue how "busy" the server is for a period of time? I've thought about allowing an ssh login without a keyphrase to log in and capture it. But, the client process is running as an apache user. Giving the apache user a shell login to the DB box does not seem like a smart idea for obvious security reasons... So far, that's all I can come up with, other than a dedicated socket server daemon on the DB machine to do it. Any creative ideas are welcomed :) Thanks -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
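One way to expose the server's load average over an ordinary database connection is an untrusted PL/Perl function that reads /proc/loadavg; a sketch, assuming plperlu is installed on the server:

    CREATE OR REPLACE FUNCTION server_loadavg() RETURNS float8 AS $$
        open(my $fh, '<', '/proc/loadavg') or die "cannot read /proc/loadavg";
        my $line = <$fh>;
        close($fh);
        my ($one_min) = split ' ', $line;
        return $one_min;    # 1-minute load average
    $$ LANGUAGE plperlu;

    -- the client can then poll this and throttle its own query threads
    SELECT server_loadavg();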
Re: [PERFORM] General advice on user functions
Thank you all for your ideas. I appreciate the quick response. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] General advice on user functions
I have a new task of automating the export of a very complex Crystal Report. One thing I have learned in the last 36 hours is that the export process to PDF is really, really, slooww.. Anyway, that is none of your concern. But, I am thinking that I can somehow utilize some of PG's strengths to work around the bottleneck in Crystal. The main problem seems to be that tens of thousands of rows of data must be summarized in the report and calculations made. Based on my recent experience, I'd say that this task would be better suited to PG than relying on Crystal Reports to do the summarizing. The difficulty I'm having is that the data needed is from about 50 different "snapshots" of counts over time. The queries are very simple, however I believe I am going to need to combine all of these queries into a single function that runs all 50 and then returns just the count(*) of each as a separate "column" in a single row. I have been Googling for hours and reading about PL/pgSQL functions in the PG docs and I have yet to find examples that return multiple items in a single row. I have seen cases that return "sets of", but that appears to be returning multiple rows, not columns. Maybe I'm barking up the wrong tree? Here's the gist of what I need to do: 1) query the count of rows that occurred between 14 months ago and 12 months ago for a given criteria, then count the rows that occurred between 2 months ago and current. Repeat for 50 different where clauses. 2) return each count(*) as a "column" so that in the end I can say:

select count_everything( ending_date );

and have it return to me:

 count_a_lastyear  count_a_last60  count_b_lastyear  count_b_last60
 ----------------  --------------  ----------------  --------------
              100             150               200             250

I'm not even sure if a function is what I'm after, maybe this can be done in a view? I am embarrassed to ask something that seems like it should be easy, but some key piece of knowledge is escaping me on this. I don't expect someone to write this for me, I just need a nudge in the right direction and maybe a URL or two to get me started. Thank you for reading this far. -Dan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
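On 8.1 and later, a PL/pgSQL function with OUT parameters returns exactly that shape -- one row with many named columns (on older releases the same thing is done by returning a composite type). A sketch with invented table and column names, showing just two of the fifty counts:

    CREATE OR REPLACE FUNCTION count_everything(ending_date date,
                                                OUT count_a_lastyear bigint,
                                                OUT count_a_last60   bigint)
    RETURNS record AS $$
    BEGIN
        SELECT count(*) INTO count_a_lastyear
          FROM some_table
         WHERE category = 'A'
           AND entry_date BETWEEN ending_date - interval '14 months'
                              AND ending_date - interval '12 months';

        SELECT count(*) INTO count_a_last60
          FROM some_table
         WHERE category = 'A'
           AND entry_date > ending_date - interval '2 months';

        -- ...repeat for the remaining criteria...
    END;
    $$ LANGUAGE plpgsql;

    SELECT * FROM count_everything('2006-01-31');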
Re: [PERFORM] quad or dual core Intel CPUs
Arjen van der Meijden wrote: But be aware that there can be substantial and unexpected differences on this relatively new platform due to simply changing the OS, like we saw when going from linux 2.6.15 to 2.6.18, as you can see here: http://tweakers.net/reviews/657/2 Having upgraded to 2.6.18 fairly recently, I am *very* interested in what caused the throughput to drop in 2.6.18? I haven't done any benchmarking on my system to know if it affected my usage pattern negatively, but I am curious if anyone knows why this happened? -Dan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM]Is it possible to start two instances of postgresql?
[EMAIL PROTECTED] wrote: Both of the two databases are live but are used for two different web apps. My company doesn't want to spend more to buy a new server, so I am thinking of implementing both under the same server and one instance. Just as an anecdote, I am running 30 databases on a single instance and it's working quite well. There may be reasons to run multiple instances, but it seems like tuning them to cooperate for memory would pose some problems - e.g. effective_cache_size. -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Killing long-running queries
Tom Lane wrote You should be using SIGINT, not SIGTERM. regards, tom lane Thank you very much for clarifying this point! It works :) ---(end of broadcast)--- TIP 6: explain analyze is your friend
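For reference, a sketch of the per-query approach: find the backend in pg_stat_activity, then cancel just that query, either with kill -INT <pid> as the postgres user or, from SQL, with pg_cancel_backend() (the pid below is obviously made up):

    -- what is everybody running?
    SELECT procpid, usename, current_query
      FROM pg_stat_activity
     WHERE current_query <> '<IDLE>';

    -- cancel the current query of one backend without touching anything else
    SELECT pg_cancel_backend(12345);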
[PERFORM] Killing long-running queries
My database is used primarily in an OLAP-type environment. Sometimes my users get a little carried away and find some way to slip past the sanity filters in the applications and end up bogging down the server with queries that run for hours and hours. And, of course, what users tend to do is to keep queuing up more queries when they don't see the first one return instantly :) So, I have been searching for a way to kill an individual query. I read in the mailing list archives that you could 'kill' the pid. I've tried this a few times and more than once, it has caused the postmaster to die(!), terminating every query that was in process, even unrelated to that query. Is there some way I can just kill a query and not risk breaking everything else when I do it? Thanks ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Encouraging multi-table join order
Tom Lane wrote: What the stats entry is saying is that the most common entries occur about 75000 times apiece (0.0017 * 45e6), which is what's scaring the planner here ;-). I think those frequencies are artificially high though. The default statistics sample size is 3000 rows (300 * statistics target, actually), so those numbers correspond to 5 or 4 rows in the sample, which is probably just random chance. Try increasing the stats targets for this table to 100, then re-ANALYZE and see what you get. The most_common_freqs entries might drop as much as a factor of 10. regards, tom lane Tom: I believe this was the problem. I upped the statistics to 100, for a sample size of 30k and now the planner does the correct nested loop/index scan and takes only 30 seconds! This is a HUGE performance increase. I wonder why the estimates were so far off the first time? This table has been ANALYZED regularly ever since creation. Once again, thank you and all of the developers for your hard work on PostgreSQL. This is by far the most pleasant management experience of any database I've worked on. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Encouraging multi-table join order
Tom Lane wrote: So it's estimating 5775 cost units per probe into eventactivity, which is pretty high --- it must think that a lot of rows will be retrieved by the index (way more than the 20 or so it thinks will get past the filter condition). What does the pg_stats entry for eventactivity.incidentid contain? select * from pg_stats where tablename = 'eventactivity' and attname='incidentid';

-[ RECORD 1 ]-----+--------------------------------------------------------------
schemaname        | public
tablename         | eventactivity
attname           | incidentid
null_frac         | 0
avg_width         | 14
n_distinct        | 8157
most_common_vals  | {P043190299,P051560740,P052581036,P052830218,P053100679,P053190889,P060370845,P042070391,P042690319,P043290117}
most_common_freqs | {0.0017,0.0017,0.0017,0.0017,0.0017,0.0017,0.0017,0.0013,0.0013,0.0013}
histogram_bounds  | {P022140319,P030471058,P033090308,P041961082,P042910689,P050311006,P051350254,P052261148,P053270945,P060240316,P061000287}
correlation       | 0.241737

It might be worth increasing the statistics target for that column to try to get a better estimate. How high should I set this? I read the default is 10, but I'm not sure if doubling this would make a difference or if I should be doing a much larger number. There's approx 45 million rows in the table, if that matters. Thanks again, Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
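For reference, the statistics target is raised per column and only takes effect at the next ANALYZE; 100 is the value that was eventually used in this thread:

    ALTER TABLE eventactivity ALTER COLUMN incidentid SET STATISTICS 100;
    ANALYZE eventactivity;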
Re: [PERFORM] Encouraging multi-table join order
Tom Lane wrote: That's very strange --- the estimated cost of the seqscan is high enough that the planner should have chosen a nestloop with inner indexscan on the big table. I'm not sure about the join-order point, but the hash plan for the first join seems wrong in any case. Um, you do have an index on eventactivity.incidentid, right? What's the datatype(s) of the incidentid columns? What happens to the plan if you turn off enable_hashjoin and enable_mergejoin? regards, tom lane Yes, eventactivity.incidentid is indexed. The datatype is varchar(40). Although, by checking this, I noticed that k_h.incidentid was varchar(100). Perhaps the difference in length between the keys caused the planner to not use the fastest method? I have no defense as to why those aren't the same.. I will make them so and check. Here's the EXPLAIN analyze with enable_hashjoin = off and enable_mergejoin = off : Limit (cost=4226535.73..4226544.46 rows=698 width=82) (actual time=74339.016..74356.521 rows=888 loops=1) -> Unique (cost=4226535.73..4226544.46 rows=698 width=82) (actual time=74339.011..74354.073 rows=888 loops=1) -> Sort (cost=4226535.73..4226537.48 rows=698 width=82) (actual time=74339.003..74344.031 rows=3599 loops=1) Sort Key: eventmain.entrydate, eventmain.incidentid, eventgeo.eventlocation, eventactivity.recordtext -> Nested Loop (cost=0.00..4226502.76 rows=698 width=82) (actual time=921.325..74314.959 rows=3599 loops=1) -> Nested Loop (cost=0.00..4935.61 rows=731 width=72) (actual time=166.354..14638.308 rows=1162 loops=1) -> Nested Loop (cost=0.00..2482.47 rows=741 width=50) (actual time=150.396..7348.013 rows=1162 loops=1) -> Index Scan using k_h_id_idx on k_h (cost=0.00..217.55 rows=741 width=14) (actual time=129.540..1022.243 rows=1162 loops=1) Index Cond: (id = 33396) Filter: ((entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) -> Index Scan using eventgeo_incidentid_idx on eventgeo (cost=0.00..3.04 rows=1 width=36) (actual time=5.260..5.429 rows=1 loops=1162) Index Cond: ((eventgeo.incidentid)::text = ("outer".incidentid)::text) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..3.30 rows=1 width=22) (actual time=5.976..6.259 rows=1 loops=1162) Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text) -> Index Scan using eventactivity1 on eventactivity (cost=0.00..5774.81 rows=20 width=52) (actual time=29.768..51.334 rows=3 loops=1162) Index Cond: (("outer".incidentid)::text = (eventactivity.incidentid)::text) Filter: ' '::text || (recordtext)::text) || ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Encouraging multi-table join order
I have a query that is intended to select from multiple "small tables" to get a limited subset of "incidentid" and then join with a "very large" table. One of the operations will require a sequential scan, but the planner is doing the scan on the very large table before joining the small ones, resulting in a huge amount of disk I/O. How would I make this query join the large table only after narrowing down the possible selections from the smaller tables? This is running on version 8.0.3. Thanks for any ideas. -Dan QUERY explain analyze select distinct eventmain.incidentid, eventmain.entrydate, eventgeo.eventlocation, recordtext as retdata from eventactivity join ( select incidentid from k_h where id = 33396 and k_h.entrydate >= '2006-1-1 00:00' and k_h.entrydate < '2006-4-8 00:00' ) id_keywords using ( incidentid ) , eventmain, eventgeo where eventmain.incidentid = eventactivity.incidentid and eventmain.incidentid = eventgeo.incidentid and ( ' ' || recordtext || ' ' like '%HAL%' ) and eventactivity.entrydate >= '2006-1-1 00:00' and eventactivity.entrydate < '2006-4-8 00:00' order by eventmain.entrydate limit 1; EXPLAIN ANALYZE OUTPUT Limit (cost=2521191.65..2521191.90 rows=6 width=187) (actual time=1360935.787..1361072.277 rows=1400 loops=1) -> Unique (cost=2521191.65..2521191.90 rows=6 width=187) (actual time=1360935.779..1361067.853 rows=1400 loops=1) -> Sort (cost=2521191.65..2521191.66 rows=6 width=187) (actual time=1360935.765..1360958.258 rows=16211 loops=1) Sort Key: eventmain.entrydate, eventmain.incidentid, eventactivity.recordtext, eventgeo.eventlocation -> Nested Loop (cost=219.39..2521191.57 rows=6 width=187) (actual time=1123.115..1360579.798 rows=16211 loops=1) -> Nested Loop (cost=219.39..2521173.23 rows=6 width=154) (actual time=1105.773..1325907.716 rows=16211 loops=1) -> Hash Join (cost=219.39..2521153.37 rows=6 width=66) (actual time=1069.476..1289608.261 rows=16211 loops=1) Hash Cond: (("outer".incidentid)::text = ("inner".incidentid)::text) -> Seq Scan on eventactivity (cost=0.00..2518092.06 rows=1532 width=52) (actual time=57.205..1288514.530 rows=2621 loops=1) Filter: ' '::text || (recordtext)::text) || ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) -> Hash (cost=217.53..217.53 rows=741 width=14) (actual time=899.128..899.128 rows=0 loops=1) -> Index Scan using k_h_id_idx on k_h (cost=0.00..217.53 rows=741 width=14) (actual time=55.097..893.883 rows=1162 loops=1) Index Cond: (id = 33396) Filter: ((entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone)) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..3.30 rows=1 width=88) (actual time=1.866..2.227 rows=1 loops=16211) Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text) -> Index Scan using eventgeo_incidentid_idx on eventgeo (cost=0.00..3.04 rows=1 width=75) (actual time=1.770..2.126 rows=1 loops=16211) Index Cond: ((eventgeo.incidentid)::text = ("outer".incidentid)::text) Total runtime: 1361080.787 ms (19 rows) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] vacuum full seems to hang on very small table
I have a table with 1 live row that I found has 115000 dead rows in it ( from a testing run ). I'm trying to VACUUM FULL the table and it has run for over 18 hours without completion. Considering the hardware on this box and the fact that performance seems reasonable in all other aspects, I'm confused as to why this would happen. The rest of the database is quite large ( 70 gigs on disk ) and I would expect a database-wide VACUUM FULL to take days to complete, but I just did 'vacuum full table_stats'. That should only do that table, correct? I'm running 8.0.3.

         Table "public.table_stats"
       Column        |            Type             | Modifiers
---------------------+-----------------------------+-----------
 count_cfs           | integer                     |
 count_ncfs          | integer                     |
 count_unitactivity  | integer                     |
 count_eventactivity | integer                     |
 min_eventmain       | timestamp without time zone |
 max_eventmain       | timestamp without time zone |
 min_eventactivity   | timestamp without time zone |
 max_eventactivity   | timestamp without time zone |
 geocoding_hitrate   | double precision            |
 recent_load         | timestamp without time zone |
 count_eventmain     | integer                     |

This is the table structure. Any ideas where to begin troubleshooting this? Thanks. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
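When a VACUUM FULL (or anything else) appears to hang, a quick check is whether it is actually stuck waiting for a lock rather than doing work; a sketch:

    -- lock requests that have not been granted yet
    SELECT pid, relation::regclass, mode, granted
      FROM pg_locks
     WHERE NOT granted;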
Re: [PERFORM] Help optimizing a slow index scan
Merlin Moncure wrote: As others will probably mention, effective queries on lot/long which is a spatial problem will require r-tree or gist. I don't have a lot of experience with exotic indexes but this may be the way to go. One easy optimization to consider making is to make an index on either (incidentid, entrydate) or (incident_id,long) which ever is more selective. This is 'yet another query' that would be fun to try out and tweak using the 8.2 upcoming row-wise comparison. merlin Thanks to everyone for your suggestions. One problem I ran into is that apparently my version doesn't support the GIST index that was mentioned. "function 'box' doesn't exist" ).. So I'm guessing that both this as well as the Earth Distance contrib require me to add on some more pieces that aren't there. Furthermore, by doing so, I am tying my queries directly to "postgres-isms". One of the long term goals of this project is to be able to fairly transparently support any ANSI SQL-compliant back end with the same code base. If I had full control over the query designs, I could make stored procedures to abstract this. However, I have to deal with a "gray box" third-party reporting library that isn't so flexible. I'll certainly consider going with something postgre-specific, but only as a last resort. I tried the multi-column index as mentioned above but didn't see any noticeable improvement in elapsed time, although the planner did use the new index. What is the real reason for the index not being very effective on these columns? Although the numbers are in a very limited range, it seems that the records would be very selective as it's not terribly common for multiple rows to share the same coords. Is the "8.2. upcoming row-wise comparison" something that would be likely to help me? Thanks again for your input ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
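For the record, the PostgreSQL-specific route mentioned above is the cube + earthdistance contrib modules, which make a bounding-box/radius search indexable; a sketch, assuming both contrib modules have been installed and loaded into the database:

    -- functional GiST index over the coordinates
    CREATE INDEX eventgeo_earth_idx ON eventgeo USING gist (ll_to_earth(lat, long));

    -- all events within 500 metres of a point
    SELECT incidentid
      FROM eventgeo
     WHERE earth_box(ll_to_earth(39.713, -104.992), 500) @> ll_to_earth(lat, long);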
Re: [PERFORM] Help optimizing a slow index scan
Dan Harris wrote: Markus Bertheau wrote: Have you tried using a GIST index on lat & long? These things are meant for two-dimensional data, whereas btree doesn't handle two-dimensional data that well. How many rows satisfy either of the long / lat condition? According to the analyze, less than 500 rows matched. I'll look into GIST indexes, thanks for the feedback. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster When I try to create a GIST index, I get the following error: create index eventgeo_lat_idx on eventgeo using GIST (lat); ERROR: data type double precision has no default operator class for access method "gist" HINT: You must specify an operator class for the index or define a default operator class for the data type. I'm not sure what a "default operator class" is, exactly.. -Dan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Help optimizing a slow index scan
Markus Bertheau wrote: Have you tried using a GIST index on lat & long? These things are meant for two-dimensional data, whereas btree doesn't handle two-dimensional data that well. How many rows satisfy either of the long / lat condition? According to the analyze, less than 500 rows matched. I'll look into GIST indexes, thanks for the feedback. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Help optimizing a slow index scan
Dan Harris wrote: explain analyze doh.. sorry to reply to my own post. But I messed up copying some of the fields into the select statement that you'll see in the "Sort Key" section of the analyze results. The mistake was mine. Everything else is "normal" between the query and the plan. -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Help optimizing a slow index scan
explain analyze select distinct eventmain.incidentid, eventmain.entrydate, eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy from eventmain, eventgeo where eventmain.incidentid = eventgeo.incidentid and ( long > -104.998027962962 and long < -104.985957781349 ) and ( lat > 39.7075542720006 and lat < 39.7186195832938 ) and eventmain.entrydate > '2006-1-1 00:00' and eventmain.entrydate <= '2006-3-17 00:00' order by eventmain.entrydate; QUERY PLAN - Unique (cost=121313.81..121330.72 rows=451 width=178) (actual time=723719.761..723726.875 rows=1408 loops=1) -> Sort (cost=121313.81..121314.94 rows=451 width=178) (actual time=723719.755..723721.807 rows=1408 loops=1) Sort Key: eventmain.entrydate, eventmain.disposition, eventmain.incidentid, eventgeo.reportingarea, eventgeo.beatid, eventmain.finaltype, eventmain.casenumber, eventgeo.eventlocation, eventmain.insertdate, eventmain.priority, eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy -> Nested Loop (cost=0.00..121293.93 rows=451 width=178) (actual time=1916.230..723712.900 rows=1408 loops=1) -> Index Scan using eventgeo_lat_idx on eventgeo (cost=0.00..85488.05 rows=10149 width=76) (actual time=0.402..393376.129 rows=22937 loops=1) Index Cond: ((lat > 39.7075542720006::double precision) AND (lat < 39.7186195832938::double precision)) Filter: ((long > -104.998027962962::double precision) AND (long < -104.985957781349::double precision)) -> Index Scan using eventmain_incidentid_idx on eventmain (cost=0.00..3.52 rows=1 width=119) (actual time=14.384..14.392 rows=0 loops=22937) Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text) Filter: ((entrydate > '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate <= '2006-03-17 00:00:00'::timestamp without time zone)) Total runtime: >>> 723729.238 ms(!) <<< I'm trying to figure out why it's consuming so much time on the index scan for eventgeo_lat_idx. Also, I have an index on "long" that the planner does not appear to find helpful. There are 3.3 million records in eventmain and eventgeo. The server has a reasonably fast RAID10 setup with 16x 15k RPM drives and 12GB of RAM ( 11GB listed as "cache" by vmstat ). Running version 8.0.2 on linux kernel 2.6.12. I have just vacuum analyze'd both tables, rebuilt the eventgeo_lat_idx index and reran the query multiple times to see if caching helped ( it didn't help much ). The server seems to be fine utilizing other fields from this table but using "long" and "lat" seem to drag it down significantly. Is it because there's such slight differences between the records, since they are all within a few hundredths of a degree from each other? Thanks for your time and ideas. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Ultra-cheap NVRAM device
On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote: I thought this might be interesting, not the least due to the extremely low price ($150 + the price of regular DIMMs): Replying before my other post came through.. It looks like their benchmarks are markedly improved since the last article I read on this. There may be more interest now.. -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Ultra-cheap NVRAM device
On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote: I thought this might be interesting, not the least due to the extremely low price ($150 + the price of regular DIMMs): This has been posted before, and the main reason nobody got very excited is that: a) it only uses the PCI bus to provide power to the device, not for I/O b) It is limited to SATA bandwidth c) The benchmarks did not prove it to be noticeably faster than a good single SATA drive A few of us were really excited at first too, until seeing the benchmarks.. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Monitoring Postgresql performance
On Sep 28, 2005, at 8:32 AM, Arnau wrote: Hi all, I have been "googling" a bit searching info about a way to monitor postgresql (CPU & Memory, num processes, ... ) You didn't mention your platform, but I have an xterm open pretty much continuously for my DB server that runs plain old top. I have customized my settings enough that I can pretty much see anything I need to from there. -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Poor performance on HP Package Cluster
Do you have any sources for that information? I am running dual SmartArray 6402's in my DL585 and haven't noticed anything poor about their performance. On Sep 1, 2005, at 2:24 PM, Luke Lonergan wrote: Are you using the built-in HP SmartArray RAID/SCSI controllers? If so, that could be your problem, they are known to have terrible and variable performance with Linux. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] extremly low memory usage
On Aug 19, 2005, at 3:01 PM, Jeremiah Jahn wrote: Rebuild in progress with just ext3 on the raid array...will see if this helps the access times. From my recent experiences, I can say ext3 is probably not a great choice for Pg databases. If you check the archives you'll see there's a lot of discussion about various journalling filesystems and ext3 usually(always?) comes up on the bottom as far as performance goes. If you insist on using it, I would at least recommend the noatime option in fstab and using data=writeback to get the faster of the journal modes. XFS seems to be a trusted choice, followed by Reiser and JFS both with the occasional controversy when the comparisons pop up. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Query plan looks OK, but slow I/O - settings advice?
On Aug 19, 2005, at 12:55 AM, Jeffrey W. Baker wrote: On Tue, 2005-08-16 at 10:46 -0700, Roger Hand wrote: Have you considered booting your machine with elevator=deadline? Although I'm not the OP for this problem, I thought I'd try it out. WOW.. this should be in a Pg tuning guide somewhere. I added this to my server tonight just for kicks and saw a pronounced improvement in IO performance. Thank you very much for mentioning this on the list. I didn't have a long enough maintenance window to do solid benchmarking, but I can say for certain that the change was noticeable, especially in VACUUM operations. Specs for the server: PG 8.0.1 Linux 2.6.12-3 kernel 4xOpteron 2.2 12GB RAM 16-drive RAID 10 XFS mounted with noatime pg_xlog on separate RAID controller -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Speedier count(*)
Thanks for all the great ideas. I have more options to evaluate now. -Dan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Speedier count(*)
I have a web page for my customers that shows them count of records and some min/max date ranges in each table of a database, as this is how we bill them for service. They can log in and check the counts at any time. I'd like for the counts to be as fresh as possible by keeping this dynamic, but I will use a periodic 'snapshot'/cron job if that is the only option to speed this up. I have thought about using the table statistics, but the estimate error is probably unacceptable because of the billing purposes. For some reason, the SQL Server we migrated the app from can return count(*) in a split second on multi-million row tables, even though it is a MUCH slower box hardware-wise, but it's now taking many seconds to run. I have read in the archives the problems MVCC brings into the count(*) dilemma forcing Pg to run a seq scan to get counts. Does SQLServer not use MVCC or have they found another approach for arriving at this number? Compounding all the min/max and counts from other tables and all those queries take about a minute to run. The tables will contain anywhere from 1 million to 40 million rows. Also, I am using "select ... group by ... order by .. limit 1" to get the min/max since I have already been bit by the issue of min() max() being slower. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
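One workaround that keeps the page fast without giving up exact numbers is a trigger-maintained counter table, so the billing query reads one row instead of scanning millions. This is only a sketch with a hypothetical table name; it assumes plpgsql is installed and that the insert volume can absorb the extra per-row trigger cost:

CREATE TABLE row_counts (tablename text PRIMARY KEY, n bigint NOT NULL);
INSERT INTO row_counts SELECT 'mytable', count(*) FROM mytable;  -- seed once

CREATE OR REPLACE FUNCTION mytable_count_trig() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE row_counts SET n = n + 1 WHERE tablename = 'mytable';
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE row_counts SET n = n - 1 WHERE tablename = 'mytable';
    END IF;
    RETURN NULL;  -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_count AFTER INSERT OR DELETE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE mytable_count_trig();

-- the customer-facing page then runs:
SELECT n FROM row_counts WHERE tablename = 'mytable';

The trade-off is that every writer now touches the same counter row, which can become a contention point under heavy concurrent insert load; a periodic snapshot refreshed by cron avoids that at the cost of freshness.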
Re: [PERFORM] Table locking problems?
On Aug 9, 2005, at 3:51 PM, John A Meinel wrote: Dan Harris wrote: On Aug 10, 2005, at 12:49 AM, Steve Poe wrote: Dan, Do you mean you did RAID 1 + 0 (RAID 10) or RAID 0 + 1? Just a clarification, since RAID 0 is still a single-point of failure even if RAID1 is on top of RAID0. Well, you tell me if I stated incorrectly. There are two raid enclosures with 7 drives in each. Each is on its own bus on a dual- channel controller. Each box has a stripe across its drives and the enclosures are mirrors of each other. I understand the controller could be a single point of failure, but I'm not sure I understand your concern about the RAID structure itself. In this configuration, if you have a drive fail on both controllers, the entire RAID dies. Lets label them A1-7, B1-7, because you stripe within a set, if a single one of A dies, and a single one of B dies, you have lost your entire mirror. The correct way of doing it, is to have A1 be a mirror of B1, and then stripe above that. Since you are using 2 7-disk enclosures, I'm not sure how you can do it well, since it is not an even number of disks. Though if you are using software RAID, there should be no problem. The difference is that in this scenario, *all* of the A drives can die, and you haven't lost any data. The only thing you can't lose is a matched pair (eg losing both A1 and B1 will cause complete data loss) I believe the correct notation for this last form is RAID 1 + 0 (RAID10) since you have a set of RAID1 drives, with a RAID0 on-top of them. I have read up on the difference now. I don't understand why it's a "single point of failure". Technically any array could be a "single point" depending on your level of abstraction. In retrospect, I probably should have gone 8 drives in each and used RAID 10 instead for the better fault-tolerance, but it's online now and will require some planning to see if I want to reconfigure that in the future. I wish HP's engineer would have promoted that method instead of 0+1.. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Table locking problems?
On Aug 10, 2005, at 12:49 AM, Steve Poe wrote: Dan, Do you mean you did RAID 1 + 0 (RAID 10) or RAID 0 + 1? Just a clarification, since RAID 0 is still a single-point of failure even if RAID1 is on top of RAID0. Well, you tell me if I stated incorrectly. There are two raid enclosures with 7 drives in each. Each is on its own bus on a dual- channel controller. Each box has a stripe across its drives and the enclosures are mirrors of each other. I understand the controller could be a single point of failure, but I'm not sure I understand your concern about the RAID structure itself. How many users are connected when your update / delete queries are hanging? Have you done an analyze verbose on those queries? Most of the traffic is from programs we run to do analysis of the data and managing changes. At the time I noticed it this morning, there were 10 connections open to the database. That rarely goes above 20 concurrent. As I said in my other response, I believe that the log will only contain the query at the point the query finishes, so if it never finishes... Have you made changes to the postgresql.conf? kernel.vm settings? IO scheduler? I set shmmax appropriately for my shared_buffers setting, but that's the only kernel tweak. If you're not doing so already, you may consider running sar (iostat) to monitor when the hanging occurs if their is a memory / IO bottleneck somewhere. I will try that. Thanks ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Table locking problems?
On Aug 9, 2005, at 1:08 PM, Tom Lane wrote: "Joshua D. Drake" <[EMAIL PROTECTED]> writes: My experience is that when this type of thing happens it is typically specific queries that cause the problem. If you turn on statement logging you can get the exact queries and debug from there. Here are some things to look for: Is it a large table (and thus large indexes) that it is updating? Is the query using indexes? Is the query modifying a LOT of rows? Another thing to look at is foreign keys. Dan could be running into problems with an update on one side of an FK being blocked by locks on the associated rows on the other side. regards, tom lane Tom, Steve, Josh: Thank you for your ideas. The updates are only on a single table, no joins. I had stats collection turned off. I have turned that on again so that I can try and catch one while the problem is occurring. The last table it did this on was about 3 million records, with 4 single-column indexes on it. The problem I had with statement logging is that if the query never finishes, it doesn't get logged as far as I can tell. So everything that did get logged was normal and would run with no issues in psql by copying and pasting it. The rows updated will certainly vary by query. I really need to "catch it in the act" with stats collection on so I can get the query from pg_stat_activity. Once I get it, I will play with explains and see if I can reproduce it outside the wild. Thanks again for your help. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
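For catching it in the act, something along these lines is a useful snapshot to take when statements start piling up (a sketch only; the column names match the 8.x-era views, where the process id is procpid and the text is current_query, and it relies on stats_command_string being on):

SELECT l.pid, l.relation::regclass AS locked_table, l.mode, l.granted,
       a.usename, a.query_start, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE NOT l.granted
ORDER BY a.query_start;

Any row returned is a backend waiting on a lock; repeating the join for the granted side of the same relation shows which backend is holding it up.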
[PERFORM] Table locking problems?
I thought I would send this to pg-performance since so many people helped me with my speed issues recently. I was definitely IO- bottlenecked. Since then, I have installed 2 RAID arrays with 7 15k drives in them in RAID 0+1 as well as add a new controller card with 512MB of cache on it. I also created this new partition on the RAID as XFS instead of ext3. These changes have definitely improved performance, but I am now finding some trouble with UPDATE or DELETE queries "hanging" and never releasing their locks. As this happens, other statements queue up behind it. It seems to occur at times of very high loads on the box. Is my only option to kill the query ( which usually takes down the whole postmaster with it! ouch ). Could these locking issues be related to the other changes I made? I'm really scared that this is related to choosing XFS, but I sure hope not. How should I go about troubleshooting the "problem" queries? They don't seem to be specific to a single table or single database. I'm running 8.0.1 on kernel 2.6.12-3 on 64-bit Opterons if that matters.. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Fwd: Help with view performance problem
On Jul 28, 2005, at 8:38 AM, Chris Hoover wrote: I did some more testing, and ran the explain analyze on the problem. In my session I did a set enable_hashjoin = false and then ran the analyze. This caused it to use the indexes as I have been expecting it to do. Now, how can I get it to use the indexes w/o manipulating the environment? What make postgresql want to sequentially scan and use a hash join? thanks, Chris explain analyze with set_hashjoin=false; prob_db=#explain analyze select * from clm_com; I had something similar to this happen recently. The planner was choosing a merge join and seq scan because my 'random_page_cost' was set too high. I had it at 3 , and ended up settling at 1.8 to get it to correctly use my indices. Once that change was in place, the planner did the 'right' thing for me. Not sure if this will help you, but it sounds similar. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
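For anyone trying the same thing, a sketch of how that experiment usually goes: adjust the cost in one session, re-check the plan, and only then make it permanent (1.8 is just the value that happened to work for me, not a general recommendation; the view and database names are taken from the quoted message):

SET random_page_cost = 1.8;
EXPLAIN ANALYZE SELECT * FROM clm_com;   -- did the plan switch to the index?

-- once satisfied, make it stick for the whole database
ALTER DATABASE prob_db SET random_page_cost = 1.8;
-- or set random_page_cost in postgresql.conf and reload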
[PERFORM] faster INSERT with possible pre-existing row?
I am working on a process that will be inserting tens of millions of rows and need this to be as quick as possible. The catch is that for each row I could potentially insert, I need to look and see if the relationship is already there to prevent multiple entries. Currently I am doing a SELECT before doing the INSERT, but I recognize the speed penalty in doing two operations. I wonder if there is some way I can say "insert this record, only if it doesn't exist already". To see if it exists, I would need to compare 3 fields instead of just enforcing a primary key. Even a small increase per record, just a few percent faster, compounded over the whole load could be a significant reduction. Thanks for any ideas you might have. -Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
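A sketch of the single-statement version, with hypothetical table and column names (target, staging, f1..f3, payload); it assumes the rows to load are staged in a table and that the three fields together identify a row. On 9.5 and later, a unique index plus ON CONFLICT DO NOTHING is the cleaner, race-free variant:

-- portable form: only insert rows that have no match on the three fields
INSERT INTO target (f1, f2, f3, payload)
SELECT s.f1, s.f2, s.f3, s.payload
FROM staging s
WHERE NOT EXISTS (
    SELECT 1 FROM target t
    WHERE t.f1 = s.f1 AND t.f2 = s.f2 AND t.f3 = s.f3
);

-- PostgreSQL 9.5+ alternative, given a unique index on (f1, f2, f3):
-- INSERT INTO target (f1, f2, f3, payload)
-- SELECT f1, f2, f3, payload FROM staging
-- ON CONFLICT (f1, f2, f3) DO NOTHING;

The NOT EXISTS form does one pass instead of a SELECT-then-INSERT round trip per row, but it is not safe against two concurrent loaders inserting the same key at the same time; only the unique constraint gives that guarantee.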
[PERFORM] Coraid/AoE device experience?
Lately, I've been reading a lot about these new Coraid AoE RAID devices ( http://www.coraid.com ). They tout it as being fast and cheap and better than iSCSI due to the lack of TCP/IP over the wire. Is it likely that a 15-drive RAID 10 Linux software RAID would outperform a 4-drive 10k SCSI RAID 0+1 for a heavy-loaded database? If not software RAID, how about their dedicated RAID controller blade? I'm definitely IO bound right now and starving for spindles. Does this make sense or is it too good to be true? Thanks -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Really bad diskio
On Jul 15, 2005, at 2:39 PM, Ron Wills wrote: Hello all, I'm running postgres 7.4.5 on a dual 2.4GHz Athlon, 1GB of RAM and a 3Ware SATA RAID. Operating system? Which file system are you using? I was having a similar problem just a few days ago and learned that ext3 was the culprit. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 15, 2005, at 9:09 AM, Dan Harris wrote: On Jul 14, 2005, at 10:12 PM, John A Meinel wrote: My biggest question is why the planner things the Nested Loop would be so expensive. Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...) here's some of my postgresql.conf. Feel free to blast me if I did something idiotic here. shared_buffers = 5 effective_cache_size = 1348000 random_page_cost = 3 work_mem = 512000 max_fsm_pages = 8 log_min_duration_statement = 6 fsync = true ( not sure if I'm daring enough to run without this ) wal_buffers = 1000 checkpoint_segments = 64 checkpoint_timeout = 3000 # FOR PG_AUTOVACUUM --# stats_command_string = true stats_row_level = true Sorry, I forgot to re-post my hardware specs. HP DL585 4 x 2.2 GHz Opteron 12GB RAM SmartArray RAID controller, 1GB hardware cache, 4x73GB 10k SCSI in RAID 0+1 ext2 filesystem Also, there are 30 databases on the machine, 27 of them are identical schemas. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 10:12 PM, John A Meinel wrote: My biggest question is why the planner things the Nested Loop would be so expensive. Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...) here's some of my postgresql.conf. Feel free to blast me if I did something idiotic here. shared_buffers = 5 effective_cache_size = 1348000 random_page_cost = 3 work_mem = 512000 max_fsm_pages = 8 log_min_duration_statement = 6 fsync = true ( not sure if I'm daring enough to run without this ) wal_buffers = 1000 checkpoint_segments = 64 checkpoint_timeout = 3000 # FOR PG_AUTOVACUUM --# stats_command_string = true stats_row_level = true ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 7:15 PM, John A Meinel wrote: Is the distribution of your rows uneven? Meaning do you have more rows with a later id than an earlier one? There are definitely some id's that will have many times more than the others. If I group and count them, the top 10 are fairly dominant in the table. Hmm.. How to do it permanently? Well you could always issue "set join_collapse_limit = 1; select * from " But obviously that isn't what you prefer. :) I think there are things you can do to make merge join more expensive than a nested loop, but I'm not sure what they are. Maybe someone else has some ideas to encourage this behavior for future work? Setting it on a per-connection basis is doable, but would add some burden to us in code. What I really don't understand is that the estimates dropped as well. The estimated row count drops to 3k instead of > 1M. The real question is why does the planner think it will be so expensive? select count(*) from k_b join k_r using (incidentid) where k_b.id=107 and k_r.id=94; count --- 373 Well, this says that they are indeed much more selective. Each one has > 1k rows, but together you end up with only 400. Is this a bad thing? Is this not "selective enough" to make it much faster? Overall, I'm much happier now after seeing the new plan come about; if I can find a way to make that join_collapse behavior permanent, I can certainly live with these numbers. Thanks again for your continued efforts. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
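For the "make it permanent" part, a sketch of the usual options so the application doesn't have to issue SET on every connection (the database and role names here are placeholders):

-- default for one database
ALTER DATABASE mydb SET join_collapse_limit = 1;

-- or only for the role that runs these reports
ALTER USER report_user SET join_collapse_limit = 1;

-- or globally, in postgresql.conf:
-- join_collapse_limit = 1

The caveat is that this ties the planner's hands for every query issued under that scope, not just this one, so the per-role form is usually the safest place to start.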
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 5:12 PM, John A Meinel wrote: Dan Harris wrote: Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; Once again, do this and post the results. We might just need to tweak your settings so that it estimates the number of rows correctly, and we don't need to do anything else. Ok, sorry I missed these the first time through: explain analyze select incidentid from k_b where id = 107; QUERY PLAN Index Scan using k_b_idx on k_b (cost=0.00..1926.03 rows=675 width=14) (actual time=0.042..298.394 rows=2493 loops=1) Index Cond: (id = 107) Total runtime: 299.103 ms select count(*) from k_b; count 698350 ( sorry! I think I said this one only had tens of thousands in it ) explain analyze select incidentid from k_r where id = 94; QUERY PLAN - Index Scan using k_r_idx on k_r (cost=0.00..2137.61 rows=757 width=14) (actual time=0.092..212.187 rows=10893 loops=1) Index Cond: (id = 94) Total runtime: 216.498 ms (3 rows) select count(*) from k_r; count 671670 That one is quite a bit slower, yet it's the same table structure and same index as k_b, also it has fewer records. I did run VACUUM ANALYZE immediately before running these queries. It seems a lot better with the join_collapse set. \ Well, the planner is powerful enough to flatten nested selects. To make it less "intelligent" you can do: SET join_collapse_limit 1; or SET join_collapse_limit 0; Which should tell postgres to not try and get tricky with your query. Again, *usually* the planner knows better than you do. So again just do it to see what you get. Ok, when join_collapse_limit = 1 I get this now: explain analyze select recordtext from eventactivity join ( select incidentid from k_r join k_b using (incidentid) where k_r.id = 94 and k_b.id = 107 ) a using (incidentid ); QUERY PLAN --- Nested Loop (cost=0.00..156509.08 rows=2948 width=35) (actual time=1.555..340.625 rows=24825 loops=1) -> Nested Loop (cost=0.00..5361.89 rows=6 width=28) (actual time=1.234..142.078 rows=366 loops=1) -> Index Scan using k_b_idx on k_b (cost=0.00..1943.09 rows=681 width=14) (actual time=0.423..56.974 rows=2521 loops=1) Index Cond: (id = 107) -> Index Scan using k_r_idx on k_r (cost=0.00..5.01 rows=1 width=14) (actual time=0.031..0.031 rows=0 loops=2521) Index Cond: ((k_r.id = 94) AND ((k_r.incidentid)::text = ("outer".incidentid)::text)) -> Index Scan using eventactivity1 on eventactivity (cost=0.00..25079.55 rows=8932 width=49) (actual time=0.107..0.481 rows=68 loops=366) Index Cond: ((eventactivity.incidentid)::text = ("outer".incidentid)::text) Total runtime: 347.975 ms MUCH better! Maybe you can help me understand what I did and if I need to make something permanent to get this behavior from now on? If you have analyzed recently can you do: SELECT relname, reltuples FROM pg_class WHERE relname='eventactivity'; It is a cheaper form than "SELECT count(*) FROM eventactivity" to get an approximate estimate of the number of rows. But if it isn't too expensive, please also give the value from SELECT count(*) FROM eventactivity. Again, that helps us know if your tables are up-to-date. Sure: select relname, reltuples from pg_class where relname='eventactivity'; relname| reltuples ---+- eventactivity | 3.16882e+07 select count(*) from eventactivity; count -- 31871142 I don't know how selective your keys are, but one of these queries should probably structure it better for the planner. 
It depends a lot on how selective your query is. eventactivity currently has around 36 million rows in it. There should only be maybe 200-300 incidentids at most that will be matched with the combination of k_b and k_r. That's why I was thinking I could somehow get a list of just the incidentids that matched the id = 94 and id = 107 in k_b and k_r first. Then, I would only need to grab a few hundred out of 36 million rows from eventactivity. Well, you can also try: SELECT count(*) FROM k_b JOIN k_r USING (i
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 9:42 AM, John A Meinel wrote: You might try giving it a little bit more freedom with: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity, k_r, k_b WHERE eventactivity.incidentid = k_r.incidentid AND eventactivity.incidentid = k_b.incidentid AND k_r.id = 94 AND k_b.id = 107 -- AND k_r.incidentid = k_b.incidentid ; I'm pretty sure that would give identical results, just let the planner have a little bit more freedom about how it does it. Also the last line is commented out, because I think it is redundant. Ok, I tried this one. My ssh keeps getting cut off by a router somewhere between me and the server due to inactivity timeouts, so all I know is that both the select and explain analyze are taking over an hour to run. Here's the explain select for that one, since that's the best I can get. explain select recordtext from eventactivity,k_r,k_b where eventactivity.incidentid = k_r.incidentid and eventactivity.incidentid = k_b.incidentid and k_r.id = 94 and k_b.id = 107; QUERY PLAN -- Merge Join (cost=9624.61..4679590.52 rows=151009549 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=4766.92..4547684.26 rows=16072733 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4186753.16 rows=46029271 width=49) -> Sort (cost=4766.92..4771.47 rows=1821 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..4668.31 rows=1821 width=68) Index Cond: (id = 107) -> Sort (cost=4857.69..4862.39 rows=1879 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..4755.52 rows=1879 width=68) Index Cond: (id = 94) (13 rows) You might also try: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN k_r USING (incidentid) JOIN k_b USING (incidentid) WHERE k_r.id = 94 AND k_b.id = 107 ; Similar results here. The query is taking at least an hour to finish. explain select recordtext from eventactivity join k_r using ( incidentid ) join k_b using (incidentid ) where k_r.id = 94 and k_b.id = 107; QUERY PLAN -- Merge Join (cost=9542.77..4672831.12 rows=148391132 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=4726.61..4542825.87 rows=15930238 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4184145.43 rows=46000104 width=49) -> Sort (cost=4726.61..4731.13 rows=1806 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..4628.92 rows=1806 width=68) Index Cond: (id = 107) -> Sort (cost=4816.16..4820.82 rows=1863 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..4714.97 rows=1863 width=68) Index Cond: (id = 94) (13 rows) Also, if possible give us the EXPLAIN ANALYZE so that we know if the planner is making accurate estimates. (You might send an EXPLAIN while waiting for the EXPLAIN ANALYZE to finish) You can also try disabling merge joins, and see how that changes things. Are there any negative sideaffects of doing this? Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; And see if postgres estimates the number of rows properly. 
I assume you have recently VACUUM ANALYZEd, which means you might need to update the statistics target (ALTER TABLE k_b ALTER COLUMN incidientid SET STATISTICS 100) default is IIRC 10, ranges from 1-1000, higher is more accurate, but makes ANALYZE slower. Right now, it looks like pg is trying to sort the entire eventactivity table for the merge join which is taking several minutes to do. Can I rephrase this so that it does the searching through k_r and k_b FIRST and then go into eventactivity using the index on incidentid? It seems like that shouldn't be too hard to make fast but my SQL query skills are only average. To me, it looks like it is doing an index scan (on k_b.id) through k_b first, sorting the results by incidentid, then merge joining that with eventactivity. I'm guessing you actually want it to merge k_b and k_r to get extra selec
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 14, 2005, at 9:47 AM, Alvaro Herrera wrote: On Thu, Jul 14, 2005 at 12:28:05AM -0600, Dan Harris wrote: . Ext3 must really be crappy for postgres, or at least is on this box. Were you using the default journal settings for ext3? Yes, I was. Next time I get a chance to reboot this box, I will try writeback and compare the benchmarks to my previous config. Thanks for the tip. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] slow joining very large table to smaller ones
I'm trying to improve the speed of this query: explain select recordtext from eventactivity inner join ( select incidentid from k_r where id = 94 ) a using ( incidentid ) inner join ( select incidentid from k_b where id = 107 ) b using ( incidentid ); QUERY PLAN -- Merge Join (cost=2747.29..4249364.96 rows=11968693 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=1349.56..4230052.73 rows=4413563 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4051200.28 rows=44519781 width=49) -> Sort (cost=1349.56..1350.85 rows=517 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..1326.26 rows=517 width=68) Index Cond: (id = 107) -> Sort (cost=1397.73..1399.09 rows=542 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..1373.12 rows=542 width=68) Index Cond: (id = 94) (13 rows) There are many millions of rows in eventactivity. There are a few ten-thousand rows in k_r and k_b. There is an index on 'incidentid' in all three tables. There should only be less than 100 rows matched in k_r and k_b total. That part on its own is very very fast. But, it should have those 100 or so incidentids extracted in under a second and then go into eventactivity AFTER doing that. At least, that's my intention to make this fast. Right now, it looks like pg is trying to sort the entire eventactivity table for the merge join which is taking several minutes to do. Can I rephrase this so that it does the searching through k_r and k_b FIRST and then go into eventactivity using the index on incidentid? It seems like that shouldn't be too hard to make fast but my SQL query skills are only average. Thanks -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 14, 2005, at 12:12 AM, Greg Stark wrote: Dan Harris <[EMAIL PROTECTED]> writes: I keep the entire database vacuumed regularly. How often is "regularly"? Well, once every day, but there aren't a ton of inserts or updates going on a daily basis. Maybe 1,000 total inserts? Also, if you've done occasional massive batch updates like you describe here you may need a VACUUM FULL or alternatively a CLUSTER command to compact the table -- vacuum identifies the free space but if you've doubled the size of your table with a large update that's a lot more free space than you want hanging around waiting to be used. I have a feeling I'm going to need to do a cluster soon. I have done several mass deletes and reloads on it. For example, as I'm writing this, I am running an UPDATE statement that will affect a small part of the table, and is querying on an indexed boolean field. ... update eventactivity set ftindex = false where ftindex = true; ( added the where clause because I don't want to alter where ftindex is null ) It's definitely worthwhile doing an "EXPLAIN UPDATE..." to see if this even used the index. It sounds like it did a sequential scan. I tried that, and indeed it was using an index, although after reading Simon's post, I realize that was kind of dumb to have an index on a bool. I have since removed it. Sequential scans during updates are especially painful. If there isn't free space lying around in the page where the updated record lies then another page has to be used or a new page added. If you're doing a massive update you can exhaust the free space available making the update have to go back and forth between the page being read and the end of the table where pages are being written. This is great info, thanks. # vmstat output ( as I am waiting for this to finish ): procs ---memory-- ---swap-- -io --system-- cpu r b swpd freebuff cache si sobibo in cs us sy id wa 0 1 5436 2823908 26140 918370401 2211 540 694 336 9 2 76 13 [I assume you ran "vmstat 10" or some other interval and then waited for at least the second line? The first line outputted from vmstat is mostly meaningless] Yeah, this was at least 10 or so down the list ( the last one before ctrl-c ) Um. That's a pretty meager i/o rate. Just over 2MB/s. The cpu is 76% idle which sounds fine but that could be one processor pegged at 100% while the others are idle. If this query is the only one running on the system then it would behave just like that. Well, none of my processors had ever reached 100% until I changed to ext2 today ( read below for more info ) Is it possible you have some foreign keys referencing these records that you're updating? In which case every record being updated might be causing a full table scan on another table (or multiple other tables). If those tables are entirely in cache then it could cause these high cpu low i/o symptoms. No foreign keys or triggers. Ok, so I remounted this drive as ext2 shortly before sending my first email today. It wasn't enough time for me to notice the ABSOLUTELY HUGE difference in performance change. Ext3 must really be crappy for postgres, or at least is on this box. Now that it's ext2, this thing is flying like never before. My CPU utilization has skyrocketed, telling me that the disk IO was constraining it immensely. I always knew that it might be a little faster, but the box feels like it can "breathe" again and things that used to be IO intensive and run for an hour or more are now running in < 5 minutes. 
I'm a little worried about not having a journalized file system, but that performance difference will keep me from switching back ( at least to ext3! ). Maybe someday I will try XFS. I would be surprised if everyone who ran ext3 had this kind of problem, maybe it's specific to my kernel, raid controller, I don't know. But, this is amazing. It's like I have a new server. Thanks to everyone for their valuable input and a big thanks to all the dedicated pg developers on here who make this possible! -Dan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 13, 2005, at 2:17 PM, Stephen Frost wrote: Could you come up w/ a test case that others could reproduce where explain isn't returning? This was simply due to my n00bness :) I had always been doing explain analyze, instead of just explain. Next time one of these queries comes up, I will be sure to do the explain without analyze. FYI that update query I mentioned in the initial thread just finished after updating 8.3 million rows. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 13, 2005, at 1:11 PM, John A Meinel wrote: I might be wrong, but there may be something much more substantially wrong than slow i/o. John Yes, I'm afraid of that too. I just don't know what tools I should use to figure that out. I have some 20 other databases on this system, same schema but varying sizes, and the small ones perform very well. It feels like there is an O(n) increase in wait time that has recently become very noticeable on the largest of them. -Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Quad Opteron stuck in the mud
So sorry, I forgot to mention I'm running version 8.0.1 Thanks ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Quad Opteron stuck in the mud
Gurus, A table in one of my databases has just crossed the 30 million row mark and has begun to feel very sluggish for just about anything I do with it. I keep the entire database vacuumed regularly. And, as long as I'm not doing a sequential scan, things seem reasonably quick most of the time. I'm now thinking that my problem is IO because anything that involves heavy ( like a seq scan ) IO seems to slow to a crawl. Even if I am using indexed fields to grab a few thousand rows, then going to sequential scans it gets very very slow. I have also had the occurrence where queries will not finish for days ( I eventually have to kill them ). I was hoping to provide an explain analyze for them, but if they never finish... even the explain never finishes when I try that. For example, as I'm writing this, I am running an UPDATE statement that will affect a small part of the table, and is querying on an indexed boolean field. I have been waiting for over an hour and a half as I write this and it still hasn't finished. I'm thinking "I bet Tom, Simon or Josh wouldn't put up with this kind of wait time..", so I thought I would see if anyone here had some pointers. Maybe I have a really stupid setting in my conf file that is causing this. I really can't believe I am at the limits of this hardware, however. The query: update eventactivity set ftindex = false where ftindex = true; ( added the where clause because I don't want to alter where ftindex is null ) The table: Column|Type | Modifiers -+-+--- entrydate | timestamp without time zone | incidentid | character varying(40) | statustype | character varying(20) | unitid | character varying(20) | recordtext | character varying(255) | recordtext2 | character varying(255) | insertdate | timestamp without time zone | ftindex | boolean | Indexes: eventactivity1 btree (incidentid), eventactivity_entrydate_idx btree (entrydate), eventactivity_ftindex_idx btree (ftindex), eventactivity_oid_idx btree (oid) The hardware: 4 x 2.2GHz Opterons 12 GB of RAM 4x10k 73GB Ultra320 SCSI drives in RAID 0+1 1GB hardware cache memory on the RAID controller The OS: Fedora, kernel 2.6.6-1.435.2.3smp ( redhat stock kernel ) filesystem is mounted as ext2 # vmstat output ( as I am waiting for this to finish ): procs ---memory-- ---swap-- -io --system-- cpu r b swpd free buff cache si sobibo incs us sy id wa 0 1 5436 2823908 26140 918370401 2211 540 694 336 9 2 76 13 # iostat output ( as I am waiting for this to finish ): avg-cpu: %user %nice%sys %iowait %idle 9.190.002.19 13.08 75.53 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn cciss/c0d0 329.26 17686.03 4317.57 161788630 39496378 # This is a dedicated postgresql server, so maybe some of these settings are more liberal than they should be? relevant ( I hope ) postgresql.conf options are: shared_buffers = 5 effective_cache_size = 1348000 random_page_cost = 3 work_mem = 512000 max_fsm_pages = 8 log_min_duration_statement = 6 fsync = true ( not sure if I'm daring enough to run without this ) wal_buffers = 1000 checkpoint_segments = 64 checkpoint_timeout = 3000 # FOR PG_AUTOVACUUM --# stats_command_string = true stats_row_level = true Thanks in advance, Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
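On the ftindex flag: a full btree on a boolean is rarely useful, but a partial index over just the rows you actually hunt for can be. A sketch, assuming the rows still waiting for full-text processing (ftindex = false) are the minority being searched for; if it is the true side that is rare, flip the predicate:

-- small index containing only the not-yet-indexed rows
CREATE INDEX eventactivity_ftpending_idx ON eventactivity (entrydate)
    WHERE ftindex = false;

-- a scan like this can then use the partial index instead of walking 30M rows
SELECT incidentid, recordtext FROM eventactivity
WHERE ftindex = false ORDER BY entrydate;

Whether the planner prefers it still depends on how selective the flag is; if nearly every row matches, a sequential scan remains the right answer.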
[PERFORM] investigating slow queries through pg_stat_activity
I've got some queries generated by my application that will, for some reason, run forever until I kill the pid. Yet, when I run the queries manually to check them out, they usually work fine. To get more information about these queries, I'm writing a utility to take snapshots of pg_stat_activity every 5 minutes. If it finds a query that runs for longer than 15 minutes, it will trap the query so I can run 'explain analyze' on it and see where the weakness is. However, the problem I have is that pg_stat_activity only returns the first n (255?) characters of the SQL as "current_query", so it gets chopped off at the end. I would very much like to find out how I can get the *entire* query that is active. Is this possible? Also, I'm sure some people will respond with "turn on query logging".. I've explored that option and the formatting of the log file and the fact that EVERY query is logged is not what I'm after for this project. The "infinite-running" queries are unpredictable and may only happen once a week. Logging 24/7 in anticipation of one of these occurrences is not something I'd like to do. Thanks, Dan Harris ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
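For what it's worth, the truncation is a server-side limit on how much of the statement the stats collector keeps, not something the query can work around: in the 8.x era it was a compile-time constant, and from 8.4 on it is the track_activity_query_size setting. A sketch of the snapshot query itself, using the pre-9.2 column names (procpid, current_query):

-- anything that has been running for more than 15 minutes
SELECT procpid, usename, query_start,
       now() - query_start AS runtime,
       current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
  AND now() - query_start > interval '15 minutes'
ORDER BY query_start;

-- on 8.4+ the visible statement length is configurable in postgresql.conf:
-- track_activity_query_size = 4096   # bytes; the default is 1024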
Re: [PERFORM] Query tuning help
On May 8, 2005, at 8:32 PM, Russell Smith wrote: I have run this, and while it is very fast, I'm concerned it's not doing what I need. How fast is very fast? It took 35 seconds to complete versus ~450 my old way. select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00' AND em.entrydate <= '2005-5-9 00:00' AND (ea.recordtext like '%RED%' OR ea.recordtext like '%CORVETTE%')) JOIN eg ON em.incidentid = eg.incidentid WHERE em.incidentid IN (select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00' AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%CORVETTE%')) JOIN eg ON em.incidentid = eg.incidentid) AND em.incidentid IN (select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00' AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%RED%')) JOIN eg ON em.incidentid = eg.incidentid) Yes, it is nasty, but so was my previous query :) So long as this is faster, I'm ok with that. I'll see if i can make this work. Thank you very much. -Dan ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Query tuning help
On May 8, 2005, at 8:06 PM, Josh Berkus wrote: If I were to use tsearch2 for full-text indexing, would I need to create another table that merges all of my recordtext rows into a single 'text' field type? No. Read the OpenFTS docs, they are fairly clear on how to set up a simple FTS index. (TSearch2 ~~ OpenFTS) If so, this is where I run into problems, as my logic also needs to match multiple words in their original order. I have been reading the Tsearch2 docs and either I don't understand something or I'm not communicating my situation clearly enough. It seems that Tsearch2 has a concept of "document". And, in everything I am reading, they expect your "document" to be all contained in a single row. Since my words can be spread across multiple rows, I don't see that Tsearch2 will combine all 'recordtext' row values with the same "incidentid" into a single vector. Am I overlooking something in the docs? I'm doing something fairly similar on one of my projects and it works very well. I'd be curious what similarities they have? Is it the searching across multiple rows or the order of words? The limitations on TSearch2 indexes are: 1) they are expensive to update, so your data loads would be noticably slower. 2) they are only fast when cached in RAM (and when cached, are *very* fast). So if you have a variety of other processes that tend to fill up RAM between searches, you may find them less useful. 3) You have to create a materialized index column next to recordtext, which will increase the size of the table. Duly noted. If this method can search across rows, I'm willing to accept this overhead for the speed it would add. In the meantime, is there any way I can reach my goal without Tsearch2 by just restructuring my query to narrow down the results by date first, then seq scan for the 'likes'? -Dan ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
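On the "document spread across many rows" question, one pattern is a separate search table holding one concatenated document per incident, rebuilt whenever that incident's rows change. A sketch only, written against the built-in full-text search of modern PostgreSQL rather than the old tsearch2 contrib module (string_agg needs 9.0+; on 8.x the tsearch2 equivalents have different names):

-- one tsvector per incident, concatenating all of its recordtext rows
CREATE TABLE ea_search AS
SELECT incidentid,
       to_tsvector('english', string_agg(recordtext, ' ')) AS doc
FROM ea
GROUP BY incidentid;

CREATE INDEX ea_search_doc_idx ON ea_search USING gin (doc);

-- RED and CORVETTE anywhere within the same incident:
SELECT incidentid
FROM ea_search
WHERE doc @@ to_tsquery('english', 'red & corvette');

When a line for an incident is revised, that incident's single row in ea_search gets rebuilt, which sidesteps the in-line text replacement problem described above; matching words in their original order is the harder part and needs either phrase-search support or a fallback LIKE check over the few candidate incidents.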
Re: [PERFORM] Query tuning help
On May 8, 2005, at 6:51 PM, Russell Smith wrote: On Mon, 9 May 2005 09:20 am, Dan Harris wrote: You cannot use an index for %CORVETTE%, or %RED%. There is no way for the index to know if a row had that in the middle without scanning the whole index. So it's much cheaper to do a sequence scan. While I believe you, I'm confused by this line in my original EXPLAIN ANALYZE: -> Index Scan using ea1 on ea (cost=0.00..2736.43 rows=42 width=47) (actual time=2.085..2.309 rows=2 loops=473) Index Cond: ((ea.incidentid)::text = ("outer".incidentid)::text) Filter: (((recordtext)::text ~~ '%RED%'::text) OR ((recordtext)::text ~~ '%CORVETTE%'::text)) Doesn't that mean it was using an index to filter? Along those lines, before I created index 'ea1', the query was much much slower. So, it seemed like creating this index made a difference. One possible way to make the query faster is to limit based on date, as you will only get about 700 rows. And then don't use subselects, as they are doing full sequence scans. I think this query does what you do above, and I think it will be faster, but I don't know. I REALLY like this idea! If I could just filter by date first and then sequential scan through those, it should be very manageable. Hopefully I can keep this goal while still accommodating the requirement listed in my next paragraph. select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >= '2005-1-1 00:00' AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%RED%' AND ea.recordtext like '%CORVETTE%') JOIN eg ON em.incidentid = eg.incidentid WHERE (recordtext like '%RED%' or recordtext like '%CORVETTE%' ); I have run this, and while it is very fast, I'm concerned it's not doing what I need. Here's the situation: Due to the format of the systems with which I integrate ( I have no control over these formats ), we will get these 'recordtext' values one line at a time, accumulating over time. The only way I can find to make this work is to insert a new record for each line. The problem is, that when someone wants to search multiple keywords, they expect these words to be matched across multiple records with a given incident number. For a very simple example: IncidentID DateRecordtext -- - --- 1 2005-05-01 14:21 blah blah blah RED blah blah 2005-05-01 14:23 not what we are looking for 1 2005-05-02 02:05 blah CORVETTE blah blah So, doing a search with an 'and' condition, e.g. WHERE RECORDTEXT LIKE '%RED%' AND RECORDTEXT LIKE '%CORVETTE%' , will not match because the condition will only be applied to a single row of recordtext at a time, not a whole group with the same incident number. If I were to use tsearch2 for full-text indexing, would I need to create another table that merges all of my recordtext rows into a single 'text' field type? If so, this is where I run into problems, as my logic also needs to match multiple words in their original order. I may also receive additional updates to the previous data. In that case, I need to replace the original record with the latest version of it. If I have already concatenated these rows into a single field, the logic to in-line replace only the old text that has changed is very very difficult at best. So, that's the reason I had to do two subqueries in my example. Please tell me if I misunderstood your logic and it really will match given my condition above, but it didn't seem like it would. Thanks again for the quick responses! This list has been a great resource for me. 
-Dan ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
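For the cross-row AND matching described above, one way to restructure the query is to decide per incident whether each keyword appears in any of its rows, and only then pull the matching rows back out. A sketch using the shortened table names from this thread; bool_or() is available from 8.1 on, and the date filter is applied in both places so the grouping never touches rows outside the range:

SELECT em.incidentid, em.entrydate, ea.recordtext AS retdata, eg.long, eg.lat
FROM em
JOIN ea USING (incidentid)
JOIN eg USING (incidentid)
WHERE em.entrydate >= '2005-01-01 00:00'
  AND em.entrydate <= '2005-05-09 00:00'
  AND (ea.recordtext LIKE '%RED%' OR ea.recordtext LIKE '%CORVETTE%')
  AND em.incidentid IN (
        SELECT ea2.incidentid
        FROM ea ea2
        JOIN em em2 USING (incidentid)
        WHERE em2.entrydate >= '2005-01-01 00:00'
          AND em2.entrydate <= '2005-05-09 00:00'
        GROUP BY ea2.incidentid
        HAVING bool_or(ea2.recordtext LIKE '%RED%')
           AND bool_or(ea2.recordtext LIKE '%CORVETTE%'))
ORDER BY em.entrydate;

The HAVING clause is what lets RED and CORVETTE live on different rows of the same incident and still count as a combined match.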
[PERFORM] Query tuning help
Sorry to bother everyone with yet another "my query isn't using an index" problem but I am over my head on this one.. I am open to ways of restructuring this query to perform better. I have a table, 'ea', with 22 million rows in it. VACUUM ANALYZE has been just run on the table. This is the result of: explain analyze select distinct em.incidentid, ea.recordtext as retdata, eg.long, eg.lat from ea, em, eg where em.incidentid = ea.incidentid and em.incidentid = eg.incidentid and em.entrydate >= '2005-1-1 00:00' and em.entrydate <= '2005-5-9 00:00' and ea.incidentid in ( select incidentid from ea where recordtext like '%RED%' ) and ea.incidentid in ( select incidentid from ea where recordtext like '%CORVETTE%' ) and ( recordtext like '%RED%' or recordtext like '%CORVETTE%' ) order by em.entrydate - ANALYZE RESULTS - Unique (cost=774693.72..774693.76 rows=1 width=159) (actual time=446787.056..446787.342 rows=72 loops=1) -> Sort (cost=774693.72..774693.72 rows=1 width=159) (actual time=446787.053..446787.075 rows=72 loops=1) Sort Key: em.incidentid, public.ea.recordtext, eg.long, eg.lat -> Nested Loop (cost=771835.10..774693.71 rows=1 width=159) (actual time=444378.655..446786.746 rows=72 loops=1) -> Nested Loop (cost=771835.10..774688.81 rows=1 width=148) (actual time=444378.532..446768.381 rows=72 loops=1) -> Nested Loop IN Join (cost=771835.10..774678.88 rows=2 width=81) (actual time=444367.080..446191.864 rows=701 loops=1) -> Nested Loop (cost=771835.10..774572.05 rows=42 width=64) (actual time=444366.859..445463.232 rows=1011 loops=1) -> HashAggregate (cost=771835.10..771835.10 rows=1 width=17) (actual time=444366.702..444368.583 rows=473 loops=1) -> Seq Scan on ea (cost=0.00..771834.26 rows=335 width=17) (actual time=259.746..444358.837 rows=592 loops=1) Filter: ((recordtext)::text ~~ '%CORVETTE%'::text) -> Index Scan using ea1 on ea (cost=0.00..2736.43 rows=42 width=47) (actual time=2.085..2.309 rows=2 loops=473) Index Cond: ((ea.incidentid)::text = ("outer".incidentid)::text) Filter: (((recordtext)::text ~~ '%RED%'::text) OR ((recordtext)::text ~~ '%CORVETTE%'::text)) -> Index Scan using ea1 on ea (cost=0.00..2733.81 rows=42 width=17) (actual time=0.703..0.703 rows=1 loops=1011) Index Cond: (("outer".incidentid)::text = (ea.incidentid)::text) Filter: ((recordtext)::text ~~ '%RED%'::text) -> Index Scan using em_incidentid_idx on em (cost=0.00..4.95 rows=1 width=67) (actual time=0.820..0.821 rows=0 loops=701) Index Cond: (("outer".incidentid)::text = (em.incidentid)::text) Filter: ((entrydate >= '2005-01-01 00:00:00'::timestamp without time zone) AND (entrydate <= '2005-05-09 00:00:00'::timestamp without time zone)) -> Index Scan using eg_incidentid_idx on eg (cost=0.00..4.89 rows=1 width=79) (actual time=0.245..0.246 rows=1 loops=72) Index Cond: (("outer".incidentid)::text = (eg.incidentid)::text) Total runtime: 446871.880 ms (22 rows) - EXPLANATION - The reason for the redundant LIKE clause is that first, I only want those "incidentid"s that contain the words 'RED' and 'CORVETTE'. BUT, those two words may exist across multiple records with the same incidentid. Then, I only want to actually work with the rows that contain one of the words. This query will repeat the same logic for however many keywords are entered by the user. I have investigated text searching options and have not found them to be congruous with my application. Why is it choosing a sequential scan one part of the query when searching for the words, yet using an index scan for another part of it? 
Is there a better way to structure the query to give it better hints? I'm using 8.0.1 on a 4-way Opteron with beefy RAID-10 and 12GB of RAM. Thank you for any advice. -Dan ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Confusion about locales and 'like' indexes
Greetings, I have been beating myself up today trying to optimize indices for a query that uses LIKE. In my research I have read that the locale setting may affect PostgreSQL's choice of seq scan vs index scan. I am running Fedora Core 2 and it appears when I run "locale" that it is set to 'en_US.UTF-8'. Did I fall into a "gotcha" trap here about C vs non-C locales? I'm not much of a C programmer so I have no idea what all this touches, and everything has been left as default during PG compilation as well as the Fedora install. I can pg_dump and initdb again with --locale=C if this will allow my LIKE queries to use indexes, but I just wanted to know if there was some other place I needed to change locales in the system, e.g. postgresql.conf or env vars? Or would the initdb and reload alone fix it? I'm running 8.0.1 if that matters. Thanks ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
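As an aside on the LIKE point, re-initdb'ing with --locale=C is not the only way out: an operator-class index lets left-anchored LIKE patterns use an index even under a non-C locale. A sketch with a hypothetical table and column; it only helps 'prefix%' searches, never '%substring%' ones:

-- text_pattern_ops (or varchar_pattern_ops for varchar columns) compares byte-wise
CREATE INDEX mytable_name_like_idx ON mytable (name text_pattern_ops);

-- this can now use the index under en_US.UTF-8:
SELECT * FROM mytable WHERE name LIKE 'CORV%';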
[PERFORM] Hardware opinions wanted
I wanted to solicit some opinions on architecture and performance from you guys. I am torn right now between these two systems to replace my aging DB server: 4 x 2.2 GHz Opteron 8GB RAM Ultra320 15kRPM RAID5 with 128MB cache and 2-way 1.2GHz POWER4+ IBM pSeries 615 8GB RAM Ultra320 15kRPM RAID5 with 64MB cache I plan on serving ~80GB of pgsql database on this machine. The current machine handles around 1.5 million queries per day. I am having some trouble finding direct comparisons between the two architectures. The OS will most likely be Linux ( I'm hedging on AIX for now ). The pSeries has 8MB cache per CPU card ( 2 CPU on a card ) while the Opteron has 1MB for each processor. I know the POWER4+ is a very fast machine but I wonder if having more processors in the Opteron system would beat it for database serving? FWIW, they are very close in price. Ignoring the fault-tolerance features of the pSeries, which one would you pick for performance? Thanks, Dan ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] tuning for AIX 5L with large memory
Christopher Browne wrote: We have a couple of these at work; they're nice and fast, although the process of compiling things, well, "makes me feel a little unclean." Thanks very much for your detailed reply, Christopher. Would you mind elaborating on the "makes me feel a little unclean" statement? Also, I'm curious which models you are running and whether you have any anecdotal comparisons for performance? I'm completely unfamiliar with AIX, so if there are dark corners that await me, I'd love to hear a little more so I can be prepared. I'm going out on a limb here and jumping to an unfamiliar architecture as well as OS, but the I/O performance of these systems has convinced me that it's what I need to break out of my I/O-limited x86 systems. I suppose when I do get it, I'll just experiment with different sizes of shared memory and run some benchmarks. For the price of these things, they had better be some good marks! Thanks again -Dan Harris ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[PERFORM] tuning for AIX 5L with large memory
I will soon have at my disposal a new IBM pSeries server. The main mission for this box will be to serve several pg databases. I have ordered 8GB of RAM and want to learn the best way to tune pg and AIX for this configuration. Specifically, I am curious about shared memory limitations. I've had to tune the shmmax on linux machines before but I'm new to AIX and not sure if this is even required on that platform? Google has not been much help for specifics here. Hoping someone else here has a similar platform and can offer some advice.. Thanks! -Dan Harris ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly