[PERFORM] SQL stupid query plan... terrible performance !
Hi, I have one performance issue... and really have no idea what's going on... When I set enable_seqscan to 0, query2 runs the same way...

upload => 60667 entities
uploadfield => 506316 entities

Query1 (181.944 ms):
  select count(*) from Upload NATURAL JOIN UploadField where Upload.ShopID = 123123;

Query2 (1136.024 ms):
  select count(*) from Upload NATURAL JOIN UploadField where Upload.UploadID = 123123;

Greetings, Jim J.

---
Details: PostgreSQL 7.4.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)

QUERY1 PLAN
 Aggregate  (cost=1972.50..1972.50 rows=1 width=0) (actual time=181.657..181.658 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..1972.46 rows=17 width=0) (actual time=181.610..181.610 rows=0 loops=1)
         ->  Seq Scan on upload  (cost=0.00..1945.34 rows=2 width=8) (actual time=181.597..181.597 rows=0 loops=1)
               Filter: (shopid = 123123)
         ->  Index Scan using relationship_3_fk on uploadfield  (cost=0.00..13.44 rows=10 width=8) (never executed)
               Index Cond: ("outer".uploadid = uploadfield.uploadid)
 Total runtime: 181.944 ms

QUERY2 PLAN
 Aggregate  (cost=15886.74..15886.74 rows=1 width=0) (actual time=1135.804..1135.806 rows=1 loops=1)
   ->  Nested Loop  (cost=1945.34..15886.69 rows=20 width=0) (actual time=1135.765..1135.765 rows=0 loops=1)
         ->  Seq Scan on uploadfield  (cost=0.00..13940.95 rows=10 width=8) (actual time=1135.754..1135.754 rows=0 loops=1)
               Filter: (123123 = uploadid)
         ->  Materialize  (cost=1945.34..1945.36 rows=2 width=8) (never executed)
               ->  Seq Scan on upload  (cost=0.00..1945.34 rows=2 width=8) (never executed)
                     Filter: (uploadid = 123123)
 Total runtime: 1136.024 ms

Table "public.upload"
  Column  |  Type  | Modifiers
----------+--------+-----------
 uploadid | bigint | not null
 nativedb | text   | not null
 shopid   | bigint | not null
Indexes:
    "pk_upload" primary key, btree (uploadid)
    "nativedb" btree (nativedb)
    "uploadshopid" btree (shopid)

Table "public.uploadfield"
    Column     |   Type   | Modifiers
---------------+----------+-----------
 uploadfieldid | bigint   | not null
 fieldnameid   | smallint | not null
 uploadid      | bigint   | not null
Indexes:
    "pk_uploadfield" primary key, btree (uploadfieldid)
    "relationship_3_fk" btree (uploadid)
    "relationship_4_fk" btree (fieldnameid)
Foreign-key constraints:
    "fk_uploadfi_fieldname_fieldnam" FOREIGN KEY (fieldnameid) REFERENCES fieldname(fieldnameid) ON UPDATE RESTRICT ON DELETE RESTRICT
    "fk_uploadfi_uploadfie_upload" FOREIGN KEY (uploadid) REFERENCES upload(uploadid) ON UPDATE RESTRICT ON DELETE RESTRICT
Re: [PERFORM] SQL stupid query plan... terrible performance !
2004-06-28 07:48, Tom Lane wrote:
> Klint Gore <[EMAIL PROTECTED]> writes:
>> On Sun, 27 Jun 2004 23:29:46 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
>
> [yawn...] Cast the constants to bigint. See previous discussions.

Thanks a lot, guys. "Cast the constants to bigint" is exactly what I was looking for. I added explicit ::data_type casts to my queries and everything works fine now.

One more thanks to Tom Lane: after your answer I found your post on the newsgroup about this problem... that post is dated 2001... You are a really patient man :) But I really had no idea what search term would make Google give me the solution ;)

Greetings, Jim J.
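For the archives, the fix in query form against the schema from the first message -- the literal is cast so its type matches the bigint column and the index can be used:

  select count(*)
  from Upload natural join UploadField
  where Upload.UploadID = 123123::bigint;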
Re: [PERFORM] Postgres on RAID5
All, I have a 13 disk (250G each) software raid 5 set using 1 16 port adaptec SATA controller. I am very happy with the performance. The reason I went with the 13 disk raid 5 set was for the space NOT performance. I have a single postgresql database that is over 2 TB with about 500 GB free on the disk. This raid set performs about the same as my ICP SCSI raid controller (also with raid 5). That said, now that postgresql 8 has tablespaces, I would NOT create 1 single raid 5 set, but 3 smaller sets. I also DO NOT have my wal and log's on this raid set, but on a smaller 2 disk mirror. Jim -- Original Message --- From: Greg Stark <[EMAIL PROTECTED]> To: Alex Turner <[EMAIL PROTECTED]> Cc: Greg Stark <[EMAIL PROTECTED]>, Arshavir Grigorian <[EMAIL PROTECTED]>, linux-raid@vger.kernel.org, pgsql-performance@postgresql.org Sent: 14 Mar 2005 15:17:11 -0500 Subject: Re: [PERFORM] Postgres on RAID5 > Alex Turner <[EMAIL PROTECTED]> writes: > > > a 14 drive stripe will max out the PCI bus long before anything else, > > Hopefully anyone with a 14 drive stripe is using some combination of 64 bit > PCI-X cards running at 66Mhz... > > > the only reason for a stripe this size is to get a total accessible > > size up. > > Well, many drives also cuts average latency. So even if you have no need for > more bandwidth you still benefit from a lower average response time by adding > more drives. > > -- > greg > > ---(end of broadcast)--- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match --- End of Original Message --- ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Help with rewriting query
How about:

  SELECT p.id, f.max_id
  FROM person AS p
  LEFT JOIN (SELECT f.p_id, max(f.id) AS max_id
             FROM food f
             GROUP BY f.p_id) AS f ON p.id = f.p_id;

Create an index on food (p_id, id). This may not gain any performance, but it's worth a try. I don't have any data similar to this to test it on. Let us know.

I assume that the food id is a sequential number across all people. Have you thought of a date field and a number representing which meal was last eaten, i.e. 1 = breakfast, 2 = mid-morning snack, etc.? Or a date field and the food id code?

Junaili Lie wrote:

Hi,
The suggested query below took forever when I tried it. In addition, as suggested by Tobias, I also tried to create an index on food(p_id, id), but still no go (same query plan). Here is the explain:

TEST1=# explain select f.p_id, max(f.id) from Food f, Person p where (f.p_id = p.id) group by p.id;

QUERY PLAN
 GroupAggregate  (cost=0.00..214585.51 rows=569 width=16)
   ->  Merge Join  (cost=0.00..200163.50 rows=2884117 width=16)
         Merge Cond: ("outer".id = "inner".p_id)
         ->  Index Scan using person_pkey on person p  (cost=0.00..25.17 rows=569 width=8)
         ->  Index Scan using person_id_food_index on food f  (cost=0.00..164085.54 rows=2884117 width=16)
(5 rows)

TEST1=# explain select p.id, (Select f.id from food f where f.p_id=p.id order by f.id desc limit 1) from person p;

QUERY PLAN
 Seq Scan on Person p  (cost=1.00..17015.24 rows=569 width=8)
   SubPlan
     ->  Limit  (cost=0.00..12.31 rows=1 width=8)
           ->  Index Scan Backward using food_pkey on food f  (cost=0.00..111261.90 rows=9042 width=8)
                 Filter: (p_id = $0)
(5 rows)

Any ideas or suggestions are appreciated.

On 6/8/05, Tobias Brox <[EMAIL PROTECTED]> wrote:

[Junaili Lie - Wed at 12:34:32PM -0700]
select f.p_id, max(f.id) from person p, food f where p.id=f.p_id group by f.p_id
will work. But I understand this is not the most efficient way. Is there another way to rewrite this query? (maybe one that involves order by desc limit 1)

Eventually, try something like
  select p.id, (select f.id from food f where f.p_id=p.id order by f.id desc limit 1)
  from person p
Not tested, no warranties. Since subqueries can be inefficient, use "explain analyze" to see which one is actually better. This issue will be solved in future versions of postgresql.

-- Tobias Brox, +47-91700050 Tallinn
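For completeness, the combination discussed in this thread -- the two-column index plus the correlated-subquery form -- would look like the sketch below; the redundant p_id in the ORDER BY is only there to make the composite index an obvious match for the sort:

  CREATE INDEX food_p_id_id_idx ON food (p_id, id);

  SELECT p.id,
         (SELECT f.id FROM food f
           WHERE f.p_id = p.id
           ORDER BY f.p_id DESC, f.id DESC
           LIMIT 1) AS last_food_id
  FROM person p;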
Re: [PERFORM] Stored Procedure
create function abc() returns setof RECORD ... then to call it you would do select * from abc() as (a text,b int,...); -- Original Message --- From: Yves Vindevogel <[EMAIL PROTECTED]> To: pgsql-performance@postgresql.org Sent: Tue, 22 Nov 2005 19:29:37 +0100 Subject: [PERFORM] Stored Procedure > Is there another way in PG to return a recordset from a function than > to declare a type first ? > > create function fnTest () returns setof > myDefinedTypeIDontWantToDefineFirst ... > > Met vriendelijke groeten, > Bien à vous, > Kind regards, > > Yves Vindevogel > Implements --- End of Original Message --- ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
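A self-contained sketch of that approach (the pg_class columns are just stand-ins for real data):

  create or replace function abc() returns setof record as $$
      select relname::text, relpages::int from pg_class
  $$ language sql;

  select * from abc() as (a text, b int);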
Re: [PERFORM] File Systems Compared
On Dec 11, 2006, at 12:54 PM, Bruno Wolff III wrote: On Wed, Dec 06, 2006 at 08:55:14 -0800, Mark Lewis <[EMAIL PROTECTED]> wrote: Anyone run their RAIDs with disk caches enabled, or is this akin to having fsync off? Disk write caches are basically always akin to having fsync off. The only time a write-cache is (more or less) safe to enable is when it is backed by a battery or in some other way made non-volatile. So a RAID controller with a battery-backed write cache can enable its own write cache, but can't safely enable the write-caches on the disk drives it manages. This appears to be changing under Linux. Recent kernels have write barriers implemented using cache flush commands (which some drives ignore, so you need to be careful). In very recent kernels, software raid using raid 1 will also handle write barriers. To get this feature, you are supposed to mount ext3 file systems with the barrier=1 option. For other file systems, the parameter may need to be different. But would that actually provide a meaningful benefit? When you COMMIT, the WAL data must hit non-volatile storage of some kind, which without a BBU or something similar, means hitting the platter. So I don't see how enabling the disk cache will help, unless of course it's ignoring fsync. Now, I have heard something about drives using their stored rotational energy to flush out the cache... but I tend to suspect urban legend there... -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] how to plan for vacuum?
On Jan 25, 2007, at 10:33 AM, Ray Stell wrote: On Thu, Jan 25, 2007 at 08:04:49AM -0800, Joshua D. Drake wrote: It really depends on the system. Most of our systems run anywhere from 10-25ms. I find that any more than that, Vacuum takes too long. How do you measure the impact of setting it to 12 as opposed to 15? If you've got a tool that will report disk utilization as a percentage it's very easy; I'll decrease the setting until I'm at about 90% utilization with the system's normal workload (leaving some room for spikes, etc). Sometimes I'll also tune the costs if reads vs. writes are a concern. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Determining server load from client
Dan Use the following plperlu function create or replace function LoadAVG() returns record as $$ use Sys::Statistics::Linux::LoadAVG; my $lxs = new Sys::Statistics::Linux::LoadAVG; my $stats = $lxs->get; return $stats; $$ language plperlu; select * from LoadAVg() as (avg_1 float,avg_5 float,avg_15 float); The Sys::Statistics::Linux has all kind of info (from the /proc) file system. Jim -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dan Harris Sent: Tuesday, March 20, 2007 8:48 PM To: PostgreSQL Performance Subject: [PERFORM] Determining server load from client I've found that it would be helpful to be able to tell how busy my dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before pounding it with some OLAP-type queries. Specifically, I have a multi-threaded client program that needs to run several thousand sequential queries. I broke it into threads to take advantage of the multi-core architecture of the server hardware. It would be very nice if I could check the load of the server at certain intervals to throttle the number of concurrent queries and mitigate load problems when other processes might be already inducing a significant load. I have seen some other nice back-end things exposed through PG functions ( e.g. database size on disk ) and wondered if there was anything applicable to this. Even if it can't return the load average proper, is there anything else in the pg_* tables that might give me a clue how "busy" the server is for a period of time? I've thought about allowing an ssh login without a keyphrase to log in and capture it. But, the client process is running as an apache user. Giving the apache user a shell login to the DB box does not seem like a smart idea for obvious security reasons... So far, that's all I can come up with, other than a dedicated socket server daemon on the DB machine to do it. Any creative ideas are welcomed :) Thanks -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Fragmentation of WAL files
I was recently running defrag on my windows/parallels VM and noticed a bunch of WAL files that defrag couldn't take care of, presumably because the database was running. What's disturbing to me is that these files all had ~2000 fragments. Now, this was an EnterpriseDB database which means the WAL files were 64MB instead of 16MB, but even having 500 fragments for a 16MB WAL file seems like it would definitely impact performance. Can anyone else confirm this? I don't know if this is a windows-only issue, but I don't know of a way to check fragmentation in unix. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] seeking advise on char vs text or varchar in search table
On Apr 23, 2007, at 7:16 AM, Merlin Moncure wrote: On 4/20/07, chrisj <[EMAIL PROTECTED]> wrote: I have a table that contains a column for keywords that I expect to become quite large and will be used for web searches. I will either index the column or come up with a simple hashing algorithm add the hash key to the table and index that column. I am thinking the max length in the keyword column I need to support is 30, but the average would be less than10 Any suggestions on whether to use char(30), varchar(30) or text, would be appreciated. I am looking for the best performance option, not necessarily the most economical on disk. Don't use char...it pads out the string to the length always. It also has no real advantage over varchar in any practical situation. Think of varchar as text with a maximum length...its no faster or slower but the database will throw out entries based on length (which can be good or a bad thing)...in this case, text feels better. AIUI, char, varchar and text all store their data in *exactly* the same way in the database; char only pads data on output, and in the actual tables it still contains the regular varlena header. The only reason I've ever used char in other databases is to save the overhead of the variable-length information, so I recommend to people to just steer clear of char in PostgreSQL. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] What`s wrong with JFS configuration?
On Apr 25, 2007, at 8:51 AM, Paweł Gruszczyński wrote: where u6 stores Fedora Core 6 operating system, and u0 stores 3 partitions with ext2, ext3 and jfs filesystem. Keep in mind that drives have a faster data transfer rate at the outer-edge than they do at the inner edge, so if you've got all 3 filesystems sitting on that array at the same time it's not a fair test. I heard numbers on the impact of this a *long* time ago and I think it was in the 10% range, but I could be remembering wrong. You'll need to drop each filesystem and create the next one to get a fair comparison. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] What`s wrong with JFS configuration?
Adding -performance back in so others can learn.

On Apr 26, 2007, at 9:40 AM, Paweł Gruszczyński wrote:

Jim Nasby napisał(a):
On Apr 25, 2007, at 8:51 AM, Paweł Gruszczyński wrote:
where u6 stores the Fedora Core 6 operating system, and u0 stores 3 partitions with ext2, ext3 and jfs filesystems.
Keep in mind that drives have a faster data transfer rate at the outer edge than they do at the inner edge, so if you've got all 3 filesystems sitting on that array at the same time it's not a fair test. I heard numbers on the impact of this a *long* time ago and I think it was in the 10% range, but I could be remembering wrong. You'll need to drop each filesystem and create the next one to get a fair comparison.

I thought about it, but my situation is not so clear, because my hard drive for postgresql data is rather "logical", being a RAID array in mode 1+0. My RAID array is divided like this:

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1      159850   163686384   83  Linux
/dev/sda2          159851      319431   163410944   83  Linux
/dev/sda3          319432      478742   163134464   83  Linux

and the partitions are:

/dev/sda1  ext2  161117780  5781744  147151720   4%  /fs/ext2
/dev/sda2  ext3  160846452  2147848  150528060   2%  /fs/ext3
/dev/sda3  jfs   163096512  3913252  159183260   3%  /fs/jfs

so if RAID 1+0 does not change anything, the JFS filesystem is on the third partition, which is at the end of the hard drive.

Yes, which means that JFS is going to be at a disadvantage to ext3, which will be at a disadvantage to ext2. You should really re-perform the tests with each filesystem in the same location.

What about an HDD with two magnetic platters? Then the speed depending on a partition's physical location is more difficult to calculate ;) Probably the first is slow, the second is fast in its first half and slow in its second half, and the third is the fastest one. In both cases my JFS partition should be at the end of the magnetic disk. Am I wrong?

I'm not a HDD expert, but as far as I know the number of platters doesn't change anything. When you have multiple platters, the drive essentially splits bytes across all the platters; it doesn't start writing one platter, then switch to another platter.
--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB   http://enterprisedb.com   512.569.9461 (cell)
Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning
On Apr 27, 2007, at 3:30 PM, Michael Stone wrote: On Fri, Apr 27, 2007 at 09:27:49AM -0400, Carlos Moreno wrote: Notice that the second part of my suggestion covers this --- have additional switches to initdb so that the user can tell it about estimates on how the DB will be used: estimated size of the DB, estimated percentage of activity that will involve writing, estimated percentage of activity that will be transactions, percentage that will use indexes, percentage of queries that will be complex, etc. etc. If the person knows all that, why wouldn't they know to just change the config parameters? Because knowing your expected workload is a lot easier for many people than knowing what every GUC does. Personally, I think it would be a tremendous start if we just provided a few sample configs like MySQL does. Or if someone wanted to get fancy they could stick a web page somewhere that would produce a postgresql.conf based simply on how much available RAM you had, since that's one of the biggest performance-hampering issues we run into (ie: shared_buffers left at the default of 32MB). -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
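To make that concrete, a rough sketch of what a "medium" sample might contain -- the values below are illustrative assumptions for a dedicated box with around 4GB of RAM, not project recommendations:

  # sample postgresql.conf fragment ("medium" dedicated server, ~4GB RAM)
  shared_buffers = 1GB
  effective_cache_size = 3GB
  work_mem = 16MB
  maintenance_work_mem = 256MB
  checkpoint_segments = 32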
Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning
On May 4, 2007, at 12:11 PM, Josh Berkus wrote: Sebastian, Before inventing a hyper tool, we might consider to provide 3-5 example szenarios for common hardware configurations. This consumes less time and be discussed and defined in a couple of days. This is of course not the correct option for a brandnew 20 spindle Sata 10.000 Raid 10 system but these are probably not the target for default configurations. That's been suggested a number of times, but some GUCs are really tied to the *exact* amount of RAM you have available. So I've never seen how "example configurations" could help. Uh... what GUCs are that exacting on the amount of memory? For a decent, base-line configuration, that is. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] How to Find Cause of Long Vacuum Times - NOOB Question
On May 5, 2007, at 5:57 PM, Yudhvir Singh Sidhu wrote: Problem: Vacuum times jump up from 45 minutes, or 1:30 minutes to 6 + hours overnight, once every 1 to 3 months. Solutions tried: db truncate - brings vacuum times down. Reindexing brings vacuum times down. Does it jump up to 6+ hours just once and then come back down? Or once at 6+ hours does it stay there? Getting that kind of change in vacuum time sounds a lot like you suddenly didn't have enough maintenance_work_mem to remember all the dead tuples in one pass; increasing that setting might bring things back in line (you can increase it on a per-session basis, too). Also, have you considered vacuuming during the day, perhaps via autovacuum? If you can vacuum more often you'll probably get less bloat. You'll probably want to experiment with the vacuum_cost_delay settings to reduce the impact of vacuuming during the day (try setting vacuum_cost_delay to 20 as a starting point). -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
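A minimal sketch of the per-session overrides described above (the memory figure is an assumption; size it to your dead-tuple volume):

  SET maintenance_work_mem = 262144;   -- 256MB, value is in KB
  SET vacuum_cost_delay = 20;          -- the suggested starting point
  VACUUM VERBOSE;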
Re: [PERFORM] Best OS for Postgres 8.2
On May 8, 2007, at 2:59 AM, [EMAIL PROTECTED] wrote:

One issue with journaling filesystems: if you journal the data as well as the metadata you end up with a very reliable setup, however it means that all your data needs to be written twice, once to the journal and once to the final location. The write to the journal can be slightly faster than a normal write to the final location (the journal is a sequential write to an existing file), however the need to write twice can effectively cut your disk I/O bandwidth in half when doing heavy writes. Worse, when you end up writing more than will fit in the journal (128M is the max for ext3), the entire system then needs to stall while the journal gets cleared to make space for the additional writes.

That's why you want to mount ext3 partitions used with PostgreSQL with data=writeback. Some folks will also use a small filesystem for pg_xlog and mount that as ext2.
--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB   http://enterprisedb.com   512.569.9461 (cell)
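As a sketch, the corresponding /etc/fstab entries might look like this (device names and mount points are assumptions):

  # data directory on ext3, metadata-only journaling
  /dev/sdb1   /var/lib/pgsql/data           ext3   noatime,data=writeback   0 2
  # small separate filesystem for pg_xlog on ext2
  /dev/sdc1   /var/lib/pgsql/data/pg_xlog   ext2   noatime                  0 2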
Re: [PERFORM] How to Find Cause of Long Vacuum Times - NOOB Question
On May 7, 2007, at 11:10 PM, Yudhvir Singh Sidhu wrote: Jim Nasby wrote: On May 5, 2007, at 5:57 PM, Yudhvir Singh Sidhu wrote: Problem: Vacuum times jump up from 45 minutes, or 1:30 minutes to 6+ hours overnight, once every 1 to 3 months. Solutions tried: db truncate - brings vacuum times down. Reindexing brings vacuum times down. Does it jump up to 6+ hours just once and then come back down? Or once at 6+ hours does it stay there? Getting that kind of change in vacuum time sounds a lot like you suddenly didn't have enough maintenance_work_mem to remember all the dead tuples in one pass; increasing that setting might bring things back in line (you can increase it on a per-session basis, too). Also, have you considered vacuuming during the day, perhaps via autovacuum? If you can vacuum more often you'll probably get less bloat. You'll probably want to experiment with the vacuum_cost_delay settings to reduce the impact of vacuuming during the day (try setting vacuum_cost_delay to 20 as a starting point). It ramps up and I have to run a db truncate to bring it back down. On some machines it creeps up, on others it spikes. I have seen it climb from 6 to 12 to 21 in 3 consequtive days. Well, what's one to do? I have maintenance_work_mem set to 32768 - Is that enough? Depends on how many dead rows there are to be vacuumed. If there's a lot, you could certainly be exceeding maintenance_work_mem. If you look closely at the output of VACUUM VERBOSE you'll see the indexes for a particular table being scanned more than once if all the dead rows can't fit into maintenance_work_mem. I vacuum daily. If you've got high update rates, that very likely might not be often enough. I just turned vacuum verbose on on one of the systems and will find out tomorrow what it shows me. I plan on playing with Max_fsm_ settings tomorrow. And I'll keep you guys up to date. The tail end of vacuumdb -av will tell you exactly how much room is needed in the FSM. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Feature suggestion : FAST CLUSTER
On May 27, 2007, at 12:34 PM, PFC wrote: On Sun, 27 May 2007 17:53:38 +0200, Jim C. Nasby <[EMAIL PROTECTED]> wrote: On Tue, May 22, 2007 at 09:29:00AM +0200, PFC wrote: This does not run a complete sort on the table. It would be about as fast as your seq scan disk throughput. Obviously, the end result is not as good as a real CLUSTER since the table will be made up of several ordered chunks and a range lookup. Therefore, a range lookup on the clustered columns would need at most N seeks, versus 1 for a really clustered table. But it only scans the table once and writes it once, even counting index rebuild. Do you have any data that indicates such an arrangement would be substantially better than less-clustered data? While the little benchmark that will answer your question is running, I'll add a few comments : I have been creating a new benchmark for PostgreSQL and MySQL, that I will call the Forum Benchmark. It mimics the activity of a forum. So far, I have got interesting results about Postgres and InnoDB and will publish an extensive report with lots of nasty stuff in it, in, say, 2 weeks, since I'm doing this in spare time. Anyway, forums like clustered tables, specifically clusteriing posts on (topic_id, post_id), in order to be able to display a page with one disk seek, instead of one seek per post. PostgreSQL humiliates InnoDB on CPU-bound workloads (about 2x faster since I run it on dual core ; InnoDB uses only one core). However, InnoDB can automatically cluster tables without maintenance. This means InnoDB will, even though it sucks and is awfully bloated, run a lot faster than postgres if things become IO- bound, ie. if the dataset is larger than RAM. Postgres needs to cluster the posts table in order to keep going. CLUSTER is very slow. I tried inserting into a new posts table, ordering by (post_id, topic_id), then renaming the new table in place of the old. It is faster, but still slow when handling lots of data. I am trying other approaches, some quite hack-ish, and will report my findings. I assume you meant topic_id, post_id. :) The problem with your proposal is that it does nothing to ensure that posts for a topic stay together as soon as the table is large enough that you can't sort it in a single pass. If you've got a long-running thread, it's still going to get spread out throughout the table. What you really want is CLUSTER CONCURRENTLY, which I believe is on the TODO list. BUT... there's another caveat here: for any post where the row ends up being larger than 2k, the text is going to get TOASTed anyway, which means it's going to be in a separate table, in a different ordering. I don't know of a good way to address that; you can cluster the toast table, but you'll be clustering on an OID, which isn't going to help you. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
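A rough sketch of the "insert ordered into a new table, then swap names" approach PFC describes (index and constraint recreation, permissions, and locking against concurrent writes are all left out):

  BEGIN;
  CREATE TABLE posts_new AS
      SELECT * FROM posts ORDER BY topic_id, post_id;
  -- recreate indexes, constraints and grants on posts_new here
  ALTER TABLE posts RENAME TO posts_old;
  ALTER TABLE posts_new RENAME TO posts;
  COMMIT;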
Re: [PERFORM] dbt2 NOTPM numbers
On Jun 4, 2007, at 1:56 PM, Markus Schiltknecht wrote: Simplistic throughput testing with dd: dd of=test if=/dev/zero bs=10K count=80 80+0 records in 80+0 records out 819200 bytes (8.2 GB) copied, 37.3552 seconds, 219 MB/s pamonth:/opt/dbt2/bb# dd if=test of=/dev/zero bs=10K count=80 80+0 records in 80+0 records out 819200 bytes (8.2 GB) copied, 27.6856 seconds, 296 MB/s I don't think that kind of testing is useful for good raid controllers on RAID5/6, because the controller will just be streaming the data out; it'll compute the parity blocks on the fly and just stream data to the drives as fast as possible. But that's not how writes in the database work (except for WAL); you're writing stuff all over the place, none of which is streamed. So in the best case (the entire stripe being updated is in the controller's cache), at a minimum it's going to have to write data + parity ( * 2 for RAID 6, IIRC) for every write. But any real-sized database is going to be far larger than your raid cache, which means there's a good chance a block being written will no longer have it's stripe in cache. In that case, the controller is going to have to read a bunch of data back off the drive, which is going to clobber performance. Now, add that performance bottleneck on top of your WAL writes and you're in real trouble. BTW, I was thinking in terms of stripe size when I wrote this, but I don't know if good controllers actually need to deal with things at a stripe level, or if they can deal with smaller chunks of a stripe. In either case, the issue is still the number of extra reads going on. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Vacuum takes forever
On May 29, 2007, at 12:03 PM, Joost Kraaijeveld wrote: vacuum_cost_delay = 200 vacuum_cost_page_hit = 6 #vacuum_cost_page_miss = 10 # 0-1 credits #vacuum_cost_page_dirty = 20# 0-1 credits vacuum_cost_limit = 100 I didn't see anyone else mention this, so... Those settings are *very* aggressive. I'm not sure why you upped the cost of page_hit or dropped the cost_limit, but I can tell you the effect: vacuum will sleep at least every 17 pages... even if those pages were already in shared_buffers and vacuum didn't have to dirty them. I really can't think of any reason you'd want to do that. I do find vacuum_cost_delay to be an extremely useful tool, but typically I'll set it to between 10 and 20 and leave the other parameters alone. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
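For reference, a gentler configuration along the lines suggested above (everything except the delay left at its default):

  vacuum_cost_delay = 15         # anywhere in the 10-20 range
  #vacuum_cost_page_hit = 1      # default
  #vacuum_cost_page_miss = 10    # default
  #vacuum_cost_page_dirty = 20   # default
  #vacuum_cost_limit = 200       # default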
Re: [PERFORM] How much ram is too much
On Jun 8, 2007, at 11:31 AM, Dave Cramer wrote: Is it possible that providing 128G of ram is too much ? Will other systems in the server bottleneck ? Providing to what? PostgreSQL? The OS? My bet is that you'll run into issues with how shared_buffers are managed if you actually try and set them to anything remotely close to 128GB. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] dbt2 NOTPM numbers
On Jun 13, 2007, at 11:43 AM, Markus Schiltknecht wrote: In the mean time, I've figured out that the box in question peaked at about 1450 NOTPMs with 120 warehouses with RAID 1+0. I'll try to compare again to RAID 6. Is there any place where such results are collected? There is the ill-used -benchmarks list, but perhaps it would be better if we setup a wiki for this... -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [pgsql-advocacy] [PERFORM] [ADMIN] Postgres VS Oracle
Can we please trim this down to just advocacy? On Jun 18, 2007, at 1:17 PM, Joshua D. Drake wrote: Jonah H. Harris wrote: On 6/18/07, Andreas Kostyrka <[EMAIL PROTECTED]> wrote: As a cynic, I might ask, what Oracle is fearing? As a realist, I might ask, how many times do we have to answer this type of anti-commercial-database flamewar-starting question? Depends? How many times are you going to antagonize the people that ask? 1. It has *nothing* to do with anti-commercial. It is anti- proprietary which is perfectly legitimate. 2. Oracle, Microsoft, and IBM have a "lot" to fear in the sense of a database like PostgreSQL. We can compete in 90-95% of cases where people would traditionally purchase a proprietary system for many, many thousands (if not hundreds of thousands) of dollars. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/ donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Database-wide VACUUM ANALYZE
On Jun 21, 2007, at 3:37 PM, Steven Flatt wrote: Thanks everyone. It appears that we had hacked the 502.pgsql script for our 8.1 build to disable the daily vacuum. I was not aware of this when building and upgrading to 8.2. Much better to change stuff in a config file than to hack installed scripts, for this very reason. :) So it looks like for the past two weeks, that 36 hour db-wide vacuum has been running every 24 hours. Good for it for being reasonably non-intrusive and going unnoticed until now. :) Although apparently not related anymore, I still think it was a good move to change autovacuum_freeze_max_age from 200 million to 2 billion. If you set that to 2B, that means you're 2^31-"2 billion"-100 transactions away from a shutdown when autovac finally gets around to trying to run a wraparound vacuum on a table. If you have any number of large tables, that could be a big problem, as autovac could get tied up on a large table for a long enough period that the table needing to be frozen doesn't get frozen in time. I suspect 1B is a much better setting. I probably wouldn't go past 1.5B. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
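In postgresql.conf terms, the suggested compromise is simply:

  autovacuum_freeze_max_age = 1000000000   # ~1 billion, rather than 2 billion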
Re: [PERFORM] Volunteer to build a configuration tool
On Jun 23, 2007, at 2:28 PM, Greg Smith wrote: On Thu, 21 Jun 2007, Campbell, Lance wrote: I have a PostgreSQL database that runs on a dedicated server. The server has 24Gig of memory. What would be the max size I would ever want to set the shared_buffers to if I where to relying on the OS for disk caching approach? It seems that no matter how big your dedicated server is there would be a top limit to the size of shared_buffers. It's impossible to say exactly what would work optimally in this sort of situation. The normal range is 25-50% of total memory, but there's no hard reason for that balance; for all we know your apps might work best with 20GB in shared_buffers and only a relatively small 4GB left over for the rest of the OS to use. Push it way up and and see what you get. This is part of why the idea of an "advanced" mode for this tool is suspect. Advanced tuning usually requires benchmarking with as close to real application data as you can get in order to make good forward progress. Agreed. EnterpriseDB comes with a feature called "DynaTune" that looks at things like server memory and sets a best-guess at a bunch of parameters. Truth is, it works fine for 90% of cases, because there's just a lot of installations where tuning postgresql.conf isn't that critical. The real issue is that the "stock" postgresql.conf is just horrible. It was originally tuned for something like a 486, but even the recent changes have only brought it up to the "pentium era" (case in point: 24MB of shared buffers equates to a machine with 128MB of memory, give or take). Given that, I think an 80% solution would be to just post small/medium/large postgresql.conf files somewhere. I also agree 100% with Tom that the cost estimators need serious work. One simple example: nothing in the planner looks at what percent of a relation is actually in shared_buffers. If it did that, it would probably be reasonable to extrapolate that percentage into how much is sitting in kernel cache, which would likely be miles ahead of what's currently done. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] best use of an EMC SAN
On Jul 11, 2007, at 12:39 PM, Chris Browne wrote: - Split off a set (6?) for WAL In my limited testing, 6 drives for WAL would be complete overkill in almost any case. The only example I've ever seen where WAL was able to swamp 2 drives was the DBT testing that Mark Wong was doing at OSDL; the only reason that was the case is because he had somewhere around 70 data drives. I suppose an entirely in-memory database might be able to swamp a 2 drive WAL as well. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] 8.2 -> 8.3 performance numbers
Sorry for the cross-post, but this is performance and advocacy related... Has anyone benchmarked HEAD against 8.2? I'd like some numbers to use in my OSCon lightning talk. Numbers for both with and without HOT would be even better (I know we've got HOT-specific benchmarks, but I want complete 8.2 -> 8.3 numbers). -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] 8.2 -> 8.3 performance numbers
On Jul 20, 2007, at 1:03 PM, Josh Berkus wrote: Jim, Has anyone benchmarked HEAD against 8.2? I'd like some numbers to use in my OSCon lightning talk. Numbers for both with and without HOT would be even better (I know we've got HOT-specific benchmarks, but I want complete 8.2 -> 8.3 numbers). We've done it on TPCE, which is a hard benchmark for PostgreSQL. On that it's +9% without HOT and +13% with HOT. I think SpecJ would show a greater difference, but we're still focussed on benchmarks we can publish (i.e. 8.2.4) right now. Bleh, that's not a very impressive number. Anyone else have something better? -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Affect of Reindexing on Vacuum Times
On Jul 25, 2007, at 11:53 AM, Y Sidhu wrote: I am wondering if reindexing heavily used tables can have an impact on vacuum times. If it does, will the impact be noticeable the next time I vacuum? Please note that I am doing vacuum, not vacuum full. I am on a FreeBSD 6.1 Release, Postgresql is 8.09 Currently I seeing a phenomenon where vacuum times go up beyond 1 hour. After I re-index 3 tables, heavily used, the vacuum times stay up for the next 3 daily vacuums and then come down to 30 to 40 minutes. I am trying to see if there is a relationship between re- indexinf and vacuum times. All other things remain the same. Which means the only change I am performing is re-indexing. Reindex will shrink index sizes, which will speed up vacuuming. But that alone doesn't explain what you're seeing, which is rather odd. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Performance on writable views
On Aug 11, 2007, at 8:58 AM, Joshua D. Drake wrote: Heikki Linnakangas wrote: Enrico Weigelt wrote: I'm often using writable views as interfaces to clients, so they only see "virtual" objects and never have to cope with the actual storage, ie. to give some client an totally denormalized view of certain things, containing only those information required for certain kind of operations. Now I've got the strange feeling that this makes updates slow, since it always has to run the whole view query to fetch an record to be updated (ie. to get OLD.*). There is some overhead in rewriting the query, but it shouldn't be significantly slower than issuing the statements behind the view directly. I wouldn't worry about it, unless you have concrete evidence that it's causing problems. I don't know about that, at least when using rules for partitioning the impact can be significant in comparison to triggers. That's because you have to re-evaluate the input query for each rule that's defined, so even if you only have rules for 2 partitions in a table (which is really about the minimum you can have, at least for some period of overlap surrounding the time when you switch to a new partition), you're looking at evaluating every input query twice. In this case, the rules presumably are just simply re-directing DML, so there'd only be one rule in play at a time. That means the only real overhead is in the rewrite engine. -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] requested shared memory size overflows size_t
Remove me from your email traffic.

> Date: Thu, 24 Jun 2010 23:05:06 -0400
> Subject: Re: [PERFORM] requested shared memory size overflows size_t
> From: robertmh...@gmail.com
> To: alvhe...@commandprompt.com
> CC: craig_ja...@emolecules.com; pgsql-performance@postgresql.org
>
> On Thu, Jun 24, 2010 at 7:19 PM, Alvaro Herrera wrote:
> > Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:
> >
> >> select relname, pg_relation_size(relname) from pg_class
> >> where pg_get_userbyid(relowner) = 'emol_warehouse_1'
> >> and relname not like 'pg_%'
> >> order by pg_relation_size(relname) desc;
> >> ERROR: relation "rownum_temp" does not exist
> >>
> >> emol_warehouse_1=> select relname from pg_class where relname = 'rownum_temp';
> >> relname
> >> --
> >> rownum_temp
> >> (1 row)
> >
> > What's the full row? I'd just add a "WHERE relkind = 'r'" to the above
> > query anyway.
>
> Yeah - also, it would probably be good to call pg_relation_size on
> pg_class.oid rather than pg_class.relname, to avoid any chance of
> confusion over which objects are in which schema.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Help with bulk read performance
On Nov 1, 2010, at 9:15 AM, Dan Schaffer wrote: > We have an application that needs to do bulk reads of ENTIRE Postgres tables > very quickly (i.e. select * from table). We have observed that such > sequential scans run two orders of magnitude slower than observed raw disk > reads (5 MB/s versus 100 MB/s). Part of this is due to the storage overhead > we have observed in Postgres. In the example below, it takes 1 GB to store > 350 MB of nominal data. However that suggests we would expect to get 35 MB/s > bulk read rates. > > Observations using iostat and top during these bulk reads suggest that the > queries are CPU bound, not I/O bound. In fact, repeating the queries yields > similar response times. Presumably if it were an I/O issue the response > times would be much shorter the second time through with the benefit of > caching. > > We have tried these simple queries using psql, JDBC, pl/java stored > procedures, and libpq. In all cases the client code ran on the same box as > the server. > We have experimented with Postgres 8.1, 8.3 and 9.0. > > We also tried playing around with some of the server tuning parameters such > as shared_buffers to no avail. > > Here is uname -a for a machine we have tested on: > > Linux nevs-bdb1.fsl.noaa.gov 2.6.18-194.17.1.el5 #1 SMP Mon Sep 20 07:12:06 > EDT 2010 x86_64 x86_64 x86_64 GNU/Linux > > A sample dataset that reproduces these results looks like the following > (there are no indexes): > > Table "bulk_performance.counts" > Column | Type | Modifiers > +-+--- > i1 | integer | > i2 | integer | > i3 | integer | > i4 | integer | > > There are 22 million rows in this case. > > We HAVE observed that summation queries run considerably faster. In this > case, > > select sum(i1), sum(i2), sum(i3), sum(i4) from bulk_performance.counts > > runs at 35 MB/s. > > > Our business logic does operations on the resulting data such that the output > is several orders of magnitude smaller than the input. So we had hoped that > by putting our business logic into stored procedures (and thus drastically > reducing the amount of data flowing to the client) our throughput would go > way up. This did not happen. > > So our questions are as follows: > > Is there any way using stored procedures (maybe C code that calls SPI > directly) or some other approach to get close to the expected 35 MB/s doing > these bulk reads? Or is this the price we have to pay for using SQL instead > of some NoSQL solution. (We actually tried Tokyo Cabinet and found it to > perform quite well. However it does not measure up to Postgres in terms of > replication, data interrogation, community support, acceptance, etc). Have you by chance tried EXPLAIN ANALYZE SELECT * FROM bulk_performance.counts? That will throw away the query results, which removes client-server considerations. Also, when you tested raw disk IO, did you do it with an 8k block size? That's the default size of a Postgres block, so all of it's IO is done that way. What does iostat show you? Are you getting a decent number of read requests/second? -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
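For reference, the checks suggested above in runnable form; EXPLAIN ANALYZE executes the scan but discards the rows, so it isolates the server-side cost from client transfer and row-to-text conversion:

  EXPLAIN ANALYZE SELECT * FROM bulk_performance.counts;

  -- for comparison, the aggregate form reported at ~35 MB/s above
  EXPLAIN ANALYZE SELECT sum(i1), sum(i2), sum(i3), sum(i4) FROM bulk_performance.counts;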
Re: [PERFORM] Help with bulk read performance
On Dec 14, 2010, at 9:27 AM, Andy Colson wrote: > Is this the same thing Nick is working on? How'd he get along? > > http://archives.postgresql.org/message-id/4cd1853f.2010...@noaa.gov So it is. The one I replied to stood out because no one had replied to it; I didn't see the earlier email. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Help with bulk read performance
BTW, have you tried prepared statements? bytea is most likely faster (in part) due to less parsing in the backend. Prepared statements would eliminate that parsing step. On Dec 14, 2010, at 10:07 AM, Nick Matheson wrote: > Hey all- > > Glad to know you are still interested... ;) > > Didn't mean to leave you hanging, the holiday and all have put some bumps in > the road. > > Dan my co-worker might be able to post some more detailed information here, > but here is a brief summary of what I am aware of: > > 1. We have not tested any stored procedure/SPI based solutions to date. > 2. The COPY API has been the best of the possible solutions explored to date. > 3. We were able to get rates on the order of 35 MB/s with the original > problem this way. > 4. Another variant of the problem we were working on included some metadata > fields and 300 float values (for this we tried three variants) > a. 300 float values as columns > b. 300 float in a float array column > c. 300 floats packed into a bytea column > Long story short on these three variants a and b largely performed the same. > C was the winner and seems to have improved the throughput on multiple > counts. 1. it reduces the data transmitted over the wire by a factor of two > (float columns and float arrays have a 2x overhead over the raw data > requirement.) 2. this reduction seems to have reduced the cpu burdens on the > server side thus producing a better than the expected 2x speed. I think the > final numbers left us somewhere in the 80-90 MB/s. > > Thanks again for all the input. If you have any other questions let us know. > Also if we get results for the stored procedure/SPI route we will try and > post, but the improvements via standard JDBC are such that we aren't really > pressed at this point in time to get more throughput so it may not happen. > > Cheers, > > Nick >> On 12/14/2010 9:41 AM, Jim Nasby wrote: >>> On Dec 14, 2010, at 9:27 AM, Andy Colson wrote: >>>> Is this the same thing Nick is working on? How'd he get along? >>>> >>>> http://archives.postgresql.org/message-id/4cd1853f.2010...@noaa.gov >>> >>> So it is. The one I replied to stood out because no one had replied to it; >>> I didn't see the earlier email. >>> -- >>> Jim C. Nasby, Database Architect j...@nasby.net >>> 512.569.9461 (cell) http://jim.nasby.net >>> >>> >>> >> >> Oh.. I didn't even notice the date... I thought it was a new post. >> >> But still... (and I'll cc Nick on this) I'd love to hear an update on how >> this worked out. >> >> Did you get it to go fast? What'd you use? Did the project go over budget >> and did you all get fired? COME ON MAN! We need to know! :-) >> >> -Andy > -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
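A minimal SQL-level sketch of that suggestion (the JDBC analogue would be a server-prepared PreparedStatement; this only illustrates where the parse step is saved, it is not the poster's code):

  PREPARE fetch_counts AS
      SELECT * FROM bulk_performance.counts;

  EXECUTE fetch_counts;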
Re: [PERFORM] Strange optimization - xmin,xmax compression :)
On Dec 17, 2010, at 8:46 PM, Robert Haas wrote: > 2010/12/6 pasman pasmański : >> hello. >> >> i tested how are distributed values xmin,xmax on pages. >> in my tables . typically there are no more than 80 records >> on pages. >> >> maybe its possible to compress xmin & xmax values to >> 1 byte/per record (+table of transactions per page)? >> its reduce the space when more than 1 record is >> from the same transaction. > > Not a bad idea, but not easy to implement, I think. Another option that would help even more for data warehousing would be storing the XIDs at the table level, because you'll typically have a very limited number of transactions per table. But as Robert mentioned, this is not easy to implement. The community would probably need to see some pretty compelling performance numbers to even consider it. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] encourging bitmap AND
On Dec 26, 2010, at 11:24 AM, Tom Lane wrote: > If you're doing interval queries enough to worry about having an index > for them, you really want an indexing structure that is designed to do > interval queries efficiently. BTW, one way to accomplish that is to transform your data into geometric shapes and then index them accordingly. Prior to the work Jeff Davis has done on time intervals it was common to treat time as points and ranges as lines or boxes. While we no longer need to play those games for time, I don't think there's an equivalent for non-time datatypes. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
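A sketch of that pre-range-type trick for a numeric interval (table and column names are hypothetical):

  CREATE TABLE t (id int, start_val float8, end_val float8);

  -- index each [start_val, end_val] interval as a degenerate box on the x axis
  CREATE INDEX t_interval_box_idx ON t
      USING gist (box(point(start_val, 0), point(end_val, 0)));

  -- "which rows overlap the interval [5, 10]?" becomes a box-overlap test
  SELECT *
  FROM t
  WHERE box(point(start_val, 0), point(end_val, 0)) && box(point(5, 0), point(10, 0));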
Re: [PERFORM] CPU bound
On Dec 20, 2010, at 12:47 AM, Mladen Gogala wrote: > Good time accounting is the most compelling reason for having a wait event > interface, like Oracle. Without the wait event interface, one cannot really > tell where the time is spent, at least not without profiling the database > code, which is not an option for a production database. Out of curiosity, have you tried using the information that Postgres exposes to dtrace? I suspect it comes close to what you can get directly out of Oracle... -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Table partitioning problem
On Mar 8, 2011, at 9:45 AM, Samba GUEYE wrote: > I have a problem with table partitioning because i have a foreign key applied > on the partionned table and it throw a constraint violation error during > inserts. > I saw on the manual > (http://www.postgresql.org/docs/8.4/interactive/ddl-inherit.html caveats > section) that it's a limitation due to postgrsql table inheritance select > queries performance are really bad without partitionning and i'm looking for > a workaround to this foreign key problem or another solution for improve > performance for larges tables. Actually, this sounds more like having a foreign key pointed at a parent table in an inheritance tree; which flat-out doesn't do what you'd want. Can you tell us what the foreign key constraint actually is, and what the inheritance setup for the tables in the FK is? -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] NULLS LAST performance
On Feb 24, 2011, at 3:47 AM, Mathieu De Zutter wrote: > > which will index optimize your sql. Interesting that 'null last' > > fools disallows index usage even when the index was created with > > nullls last as the default. > > The problem is that his query needs to scan the index in DESC order, > which means it's effectively NULLS FIRST, which doesn't match the > requested sort order. > > Merlin, Tom, > > Thanks for explaining the behavior! > > Any chance that the planner could get smarter about this? In my naive view, > it would just be telling the planner that it can disregard "NULLS" when > searching for an index, in case the column is known to be NOT NULL. Unfortunately, I don't think the planner actually has that level of knowledge. A more reasonable fix might be to teach the executor that it can do 2 scans of the index: one to get non-null data and a second to get null data. I don't know if the use case is prevalent enough to warrant the extra code though. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Query planner issue
With PostgreSQL 7.2.1 you will need to do BOTH vacuum and reindex, and with a table that gets many updates/deletes you should run vacuum more than daily. Both issues have been solved in 8.1.

Jim

-- Original Message ---
From: Emmanuel Lacour <[EMAIL PROTECTED]>
To: pgsql-performance@postgresql.org
Sent: Mon, 30 Jan 2006 23:57:11 +0100
Subject: [PERFORM] Query planner issue

> Hi everybody,
>
> I have the following problem: on a test server, if I do a fresh import
> of production data and then run
> 'explain analyze select count(*) from mandats;'
>
> I get this result:
>
> Aggregate  (cost=6487.32..6487.32 rows=1 width=0) (actual time=607.61..607.61 rows=1 loops=1)
>   ->  Seq Scan on mandats  (cost=0.00..6373.26 rows=45626 width=0) (actual time=0.14..496.20 rows=45626 loops=1)
> Total runtime: 607.95 msec
>
> On the production server, if I do the same (without other use of the server), I get:
>
> Aggregate  (cost=227554.33..227554.33 rows=1 width=0) (actual time=230705.79..230705.79 rows=1 loops=1)
>   ->  Seq Scan on mandats  (cost=0.00..227440.26 rows=45626 width=0) (actual time=0.03..230616.64 rows=45760 loops=1)
> Total runtime: 230706.08 msec
>
> Does anyone have an idea how to solve this poor performance? I think it is
> caused by the many deletes/inserts on this table every day, but how do I solve
> it? I need to run this query each hour :(. I run vacuum each night; postgresql
> is unfortunately 7.2.1 :( (no upgrade before 2 or 3 months).
>
> --
> Emmanuel Lacour                                          Easter-eggs
> 44-46 rue de l'Ouest - 75014 Paris - France - Métro Gaité
> Phone: +33 (0) 1 43 35 00 37 - Fax: +33 (0) 1 41 35 00 76
> mailto:[EMAIL PROTECTED] - http://www.easter-eggs.com
--- End of Original Message ---
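The concrete commands for the advice at the top of this reply (REINDEX takes an exclusive lock, so schedule it for a quiet period):

  VACUUM ANALYZE mandats;
  REINDEX TABLE mandats;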
Re: [PERFORM] Postgres and Ingres R3 / SAN
Adding -performance back; you should do a reply-all if you want to reply to list messages. > From: Jeremy Haile [mailto:[EMAIL PROTECTED] > > Can you point us at more info about this? I can't even find > a website > > for Ingres... > > Ingres is based off of the same original codebase that PostgreSQL was > based upon (a long time ago) It is owned by Computer > Associates and was > open sourced last year. It supports clustering and replication, and > I've seen an Ingres install set up as a cluster backed by a > SAN before. > I just haven't talked to anyone (at least unbiased) who has used this > type of setup in production, and I'm not fully aware of the > advantages/disadvantages of this type of setup with Ingres. > Since this > group seems pretty knowledgable about performance advantages > (and we are > currently running PostgreSQL), I wanted to see if there were any > experiences or opinions. > > Here is a link to their website: > http://opensource.ca.com/projects/ingres > > > > Perhaps if you posted your performance requirements someone > could help > > point you to a solution that would meet them. > > This is honestly more of a curiousity question at the moment, > so I don't > have any specific numbers. We definitely have a requirement for > failover in the case of a machine failure, so we at least need > Master->Slave replication. However, I wanted to solicit > information on > clustering alternatives as well, since scalability will likely be a > future problem for our database. Ahh, ok... that's likely a much different requirement than true clustering. What a lot of folks do right now is segregate their application into a read-only stream and the more interactive read-write streams, and then use Slony to replicate data to a number of machines for the read-only work. This way anyone who's hitting the site read-only (and can handle some possible delay) will just hit one of the slave machines. People who are doing interactive work (updating data) will hit the master. Since most applications do far more reading than they do writing, this is a pretty good way to load-balance. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] 1 TB of memory
PostgreSQL tuned to the max and still too slow? Database too big to fit into memory? Here's the solution! http://www.superssd.com/products/tera-ramsan/ Anyone purchasing one will be expected to post benchmarks! :) -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] update == delete + insert?
go with design 1, update does = delete + insert. -- Original Message --- From: "Craig A. James" <[EMAIL PROTECTED]> To: pgsql-performance@postgresql.org Sent: Mon, 20 Mar 2006 14:49:43 -0800 Subject: [PERFORM] update == delete + insert? > I've seen it said here several times that "update == delete + insert". On > the other hand, I've noticed that > "alter table [add|drop] column ..." is remarkably fast, even for very large > tables, which leads me to wonder > whether each column's contents are in a file specifically for that column. > > My question: Suppose I have a very "wide" set of data, say 100 columns, and > one of those columns will be > updated often, but the others are fairly static. I have two choices: > > Design 1: >create table a ( > id integer, > frequently_updated integer); > >create table b( > id integer, > infrequently_updated_1 integer, > infrequently_updated_2 integer, > infrequently_updated_3 integer, > ... etc. > infrequently_updated_99 integer); > > Design 2: >create table c( > id integer, > frequently_updated integer, > infrequently_updated_1 integer, > infrequently_updated_2 integer, > infrequently_updated_3 integer, > ... etc. > infrequently_updated_99 integer); > > If "update == delete + insert" is strictly true, then "Design 2" would be > poor since 99 columns would be moved > around with each update. But if columns are actually stored in separate > files, the Designs 1 and 2 would be > essentially equivalent when it comes to vacuuming. > > Thanks, > Craig > > ---(end of broadcast)--- > TIP 6: explain analyze is your friend --- End of Original Message --- ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Query using SeqScan instead of IndexScan
On Apr 2, 2006, at 6:30 PM, Josh Berkus wrote: But just as a follow up question to your #1 suggestion, I have 8 GB of ram in my production server. You're saying to set the effective_cache_size then to 5 GB roughly? Somewhere around 655360? Currently it is set to 65535. Is that something that's OS dependent? I'm not sure how much memory my server sets aside for disk caching. Yes, about. It's really a judgement call; you're looking for the approximate combined RAM available for disk caching and shared mem. However, this is just used as a way of estimating the probability that the data you want is cached in memory, so you're just trying to be order-of-magnitude accurate, not to-the-MB accurate. FWIW, I typically set effective_cache_size to the amount of memory in the machine minus 1G for the OS and various other daemons, etc. But as Josh said, as long as you're somewhere in the ballpark it's probably good enough. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
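As a sketch of what that looks like in an 8.1-era postgresql.conf, where the setting is expressed in 8 kB pages (the figure is an estimate, not a measurement):

    # 8 GB machine, roughly 1 GB left for the OS and other daemons:
    # 7 GB / 8 kB = 917504 pages
    effective_cache_size = 917504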
Re: [PERFORM] Query using SeqScan instead of IndexScan
On Apr 1, 2006, at 12:51 PM, Brendan Duddridge wrote: from SELECT * FROM pg_stats WHERE tablename='table' AND attname='category_id' I find correlation on category_product for category_id is 0.643703 Would setting the index on category_id to be clustered help with this? It would absolutely help on the query in question. In my experience, a correlation of 0.64 is too low to allow an index scan to be used for anything but a tiny number of rows. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
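Clustering in that era is a one-shot physical reorder followed by a fresh ANALYZE; a sketch, with a hypothetical index name on the category_product table:

    CLUSTER category_product_category_id_idx ON category_product;
    ANALYZE category_product;
    -- CLUSTER is not maintained automatically; re-run it periodically
    -- as the physical correlation decays.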
Re: [PERFORM] freebsd/softupdates for data dir
On Apr 4, 2006, at 10:41 AM, Vivek Khera wrote: On Apr 3, 2006, at 10:10 PM, Mark Kirkwood wrote: I've always left them on, and never had any issues...(even after unscheduled power loss - which happened here yesterday). As I understand it, the softupdate code reorders *metadata* operations, and does not alter data operations - so the effect of fysnc(2) on a preexisting file is not changed by softupdates being on or off. This is also my understanding, and I also leave softupdates on for the data partition. Even if it doesn't improve performance, it will not reduce it, and otherwise does no harm with respect to postgres' disk usage. More importantly, it allows the system to come up and do fsck in the background. If you've got a large database that's a pretty big benefit. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Better index strategy for many fields with few values
Adding -performance back in. -Original Message- From: Oscar Picasso [mailto:[EMAIL PROTECTED]] Sent: Wednesday, April 12, 2006 5:51 PM To: Jim Nasby Subject: Re: [PERFORM] Better index strategy for many fields with few values I would like to try it. However, in another post I added that, contrary to what I stated initially, the paramXX columns are not all mandatory in the query. So it seems that requirement makes the problem more complex. Doesn't this new requirement rule out this solution? No, just group the columns logically. By the way, I have tested indexing each column individually to check what happens with bitmap scans. My test table is 1 million rows. The explain analyze command shows that a bitmap scan is sometimes used, but I still end up with queries that can take up to 10s, which is way too much. "Jim C. Nasby" <[EMAIL PROTECTED]> wrote: On Wed, Apr 12, 2006 at 02:59:32PM +0200, Markus Schaber wrote: > > I was thinking about using a multicolumn index, but I have read that > > we should limit multicolumn indices to at most 2 or 3 columns. > > Yes, that's true, the index overhead gets too high. > > > I was also thinking about using a functional index. > > If there's a logical relation between those values so that they can easily > be combined, that may be a good alternative. How would that be any better than just doing a multi-column index? > I just had another weird idea: > > As your paramXX values can have only 10 parameters, it also might be > feasible to use a bunch of 10 conditional indices, like: > > CREATE INDEX foo1 ON table (param1, param2) WHERE param0='1st value'; > CREATE INDEX foo2 ON table (param1, param2) WHERE param0='2nd value'; > CREATE INDEX foo3 ON table (param1, param2) WHERE param0='3rd value'; > [...] Not all that weird; it's known as index partitioning. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
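Since the follow-up question is about the columns not all being mandatory, the single-column-index-plus-bitmap approach the test hints at looks like this (placeholder names): individual indexes get combined only for the columns the query actually constrains.

    CREATE INDEX mytable_param0_idx ON mytable (param0);
    CREATE INDEX mytable_param1_idx ON mytable (param1);
    CREATE INDEX mytable_param2_idx ON mytable (param2);
    -- A query constraining any subset of the columns can combine the
    -- matching indexes with a BitmapAnd:
    EXPLAIN ANALYZE
    SELECT * FROM mytable WHERE param0 = '1st value' AND param2 = 'x';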
Re: [PERFORM] multi column query
You need to run EXPLAIN ANALYZE. Also, what's random_page_cost set to? And the output of \d chkpfw_tr_dy_dimension. The cost for that index scan looks way too high. And please reply-all so that the list is included. > -Original Message- > From: Sriram Dandapani [mailto:[EMAIL PROTECTED] > Sent: Wednesday, April 12, 2006 7:48 PM > To: Jim Nasby > Subject: RE: [PERFORM] multi column query > > > I executed enable_seqscan=off and then ran an explain plan on > the query > > UPDATE chkpfw_tr_dy_dimension >SET summcount = a.summcount + b.summcount, >bytes = a.bytes + b.bytes, >duration = a.duration + b.duration >from chkpfw_tr_dy_dimension a, > c_chkpfw_dy_tr_updates b >WHERE a.firstoccurrence = b.firstoccurrence > AND a.customerid_id = b.customerid_id >AND a.sentryid_id = b.sentryid_id > AND a.node_id = b.node_id >AND a.interface_id = b.interface_id >AND a.source_id = b.source_id >AND a.destination_id = b.destination_id >AND a.sourceport_id = b.sourceport_id >AND a.destinationport_id = b.destinationport_id >AND a.inoutbound_id = b.inoutbound_id >AND a.action_id = b.action_id >AND a.protocol_id = b.protocol_id >AND a.service_id = b.service_id >AND a.sourcezone_id = b.sourcezone_id >AND a.destinationzone_id = > b.destinationzone_id; > > > > Here is the query plan > > > "Nested Loop (cost=20036.18..221851442.39 rows=1 width=166)" > " -> Merge Join (cost=10036.18..121620543.75 rows=1 width=96)" > "Merge Cond: (("outer".firstoccurrence = > "inner".firstoccurrence) AND ("outer".sentryid_id = > "inner".sentryid_id) > AND ("outer".node_id = "inner".node_id))" > "Join Filter: (("outer".customerid_id = "inner".customerid_id) > AND ("outer".interface_id = "inner".interface_id) AND > ("outer".source_id > = "inner".source_id) AND ("outer".destination_id = > "inner".destination_id) AND ("outer".sourceport_id = "inner".s (..)" > "-> Index Scan using chkpfw_tr_dy_idx1 on > chkpfw_tr_dy_dimension a (cost=0.00..21573372.84 rows=6281981 > width=88)" > "-> Sort (cost=10036.18..10037.38 rows=480 > width=136)" > " Sort Key: b.firstoccurrence, b.sentryid_id, b.node_id" > " -> Seq Scan on c_chkpfw_dy_tr_updates b > (cost=1.00..10014.80 rows=480 width=136)" > " -> Seq Scan on chkpfw_tr_dy_dimension > (cost=1.00..100168078.81 rows=6281981 width=70)" > > -Original Message- > From: Jim C. Nasby [mailto:[EMAIL PROTECTED] > Sent: Wednesday, April 12, 2006 5:44 PM > To: Sriram Dandapani > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] multi column query > > On Wed, Apr 12, 2006 at 05:32:32PM -0700, Sriram Dandapani wrote: > > Hi > > > > When I update a table that has 20 columns and the where clause > includes > > 16 of the columns (this is a data warehousing type update > on aggregate > > fields), > > > > The bitmap scan is not used by the optimizer. The table is > indexed on > 3 > > of the 20 fields. The update takes really long to finish (on a 6 > million > > row table) > > > > Do I need to do some "magic" with configuration to turn on bitmap > scans. > > No. What's explain analyze of the query show? What's it doing now? > Seqscan? You might try set enable_seqscan=off and see what that does. > -- > Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] > Pervasive Software http://pervasive.comwork: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Blocks read for index scans
While working on determining a good stripe size for a database, I realized it would be handy to know what the average request size is. Getting this info is a simple matter of joining pg_stat_all_tables and pg_statio_all_tables and doing some math, but there's one issue I've found; it appears that there's no information on how many heap blocks were read in by an index scan. Is there any way to get that info? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] pgmemcache
On Apr 13, 2006, at 12:38 PM, Tom Lane wrote: Christian Storm <[EMAIL PROTECTED]> writes: Not sure if I follow why this is a problem. Seems like it would be beneficial to have both BEFORE and AFTER COMMIT triggers. With the BEFORE COMMIT trigger you would have the ability to 'un- commit' (rollback) the transaction. With the AFTER COMMIT trigger you wouldn't have that option because the commit has already been successful. However, with an AFTER COMMIT you would be able to trigger other downstream events that rely on a transaction successfully committing. An AFTER COMMIT trigger would have to be in a separate transaction. What happens if there's more than one, and one of them fails? Even more to the point, if it's a separate transaction, don't you have to fire all these triggers again when you commit that transaction? The idea seems circular. I suspect that in reality you'd probably want each on-commit trigger to be it's own transaction, but it depends on what you're doing. Also, I can't see any use for them where you'd actually be interacting with the database, only if you were calling something externally via a function. One example would be sending an email out when a certain table changes; in many cases it's better to let the change happen even if the email can't be sent, and you'd rather not send an email if the transaction just ends up rolling back for some reason. And yes, you'd have to ensure you didn't code yourself up a trigger loop. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Blocks read for index scans
Adding -performance back in... > From: Steve Poe [mailto:[EMAIL PROTECTED] > Jim, > > I could be way off, but doesn't from pg_statio_user_tables > contain this > information? http://www.postgresql.org/docs/8.1/interactive/monitoring-stats.html#MONITORING-STATS-VIEWS states: "numbers of disk blocks read and buffer hits in all indexes of that table" That leads me to believe that it's only tracking index blocks read, and not heap blocks read. One could presume that each index row read as reported by pg_stat_all_tables would represent a heap block read, but a large number of those would (hopefully) have already been in shared_buffers. > On Thu, 2006-04-13 at 13:00 -0500, Jim Nasby wrote: > > While working on determining a good stripe size for a database, I > > realized it would be handy to know what the average request > size is. > > Getting this info is a simple matter of joining pg_stat_all_tables > > and pg_statio_all_tables and doing some math, but there's > one issue > > I've found; it appears that there's no information on how > many heap > > blocks were read in by an index scan. Is there any way to > get that info? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
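The join being described, as a rough sketch; note that heap_blks_read also counts heap blocks fetched by index scans, which is exactly the number this thread says isn't broken out separately, so the per-seqscan figure is only an approximation:

    SELECT s.schemaname, s.relname,
           io.heap_blks_read, io.idx_blks_read,
           s.seq_scan, s.idx_scan,
           io.heap_blks_read::float8 / NULLIF(s.seq_scan, 0) AS blks_per_seq_scan
    FROM pg_stat_all_tables s
    JOIN pg_statio_all_tables io USING (relid)
    ORDER BY io.heap_blks_read DESC;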
Re: [PERFORM] Quick Performance Poll
Simon, I have many databases over 1T with the largest being ~6T. All of my databases store telecom data, such as call detail records. The access is very fast when looking for a small subset of the data. For servers, I am using white box intel XEON and P4 systems with SATA disks, 4G of memory. SCSI is out of our price range, but if I had unlimited $ I would go with SCSI /SCSI raid instead. Jim -- Original Message --- From: "Simon Dale" <[EMAIL PROTECTED]> To: Sent: Thu, 20 Apr 2006 14:18:58 +0100 Subject: [PERFORM] Quick Performance Poll > Hi, > > I was just wondering whether anyone has had success with storing more > than 1TB of data with PostgreSQL and how they have found the > performance. > > We need a database that can store in excess of this amount and still > show good performance. We will probably be implementing several tables > with foreign keys and also indexes which will obviously impact on both > data size and performance too. > > Many thanks in advance, > > Simon > Visit our Website at http://www.rm.com > > This message is confidential. You should not copy it or disclose its contents > to anyone. You may use and apply > the information for the intended purpose only. Internet communications are > not secure; therefore, RM does not > accept legal responsibility for the contents of this message. Any views or > opinions presented are those of the > author only and not of RM. If this email has come to you in error, please > delete it, along with any > attachments. Please note that RM may intercept incoming and outgoing email > communications. > > Freedom of Information Act 2000 > This email and any attachments may contain confidential information belonging > to RM. Where the email and any > attachments do contain information of a confidential nature, including > without limitation information relating > to trade secrets, special terms or prices these shall be deemed for the > purpose of the Freedom of Information > Act 2000 as information provided in confidence by RM and the disclosure of > which would be prejudicial to RM's > commercial interests. > > This email has been scanned for viruses by Trend ScanMail. --- End of Original Message --- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Quick Performance Poll
First of all this is NOT a single table and yes I am using partitioning and the constaint exclusion stuff. the largest set of tables is over 2T. I have not had to rebuild the biggest database yet, but for a smaller one ~1T the restore takes about 12 hours including many indexes on both large and small tables Jim -- Original Message --- From: "Luke Lonergan" <[EMAIL PROTECTED]> To: [EMAIL PROTECTED], "Simon Dale" <[EMAIL PROTECTED]>, pgsql-performance@postgresql.org Sent: Thu, 20 Apr 2006 07:31:33 -0700 Subject: Re: [PERFORM] Quick Performance Poll > Jim, > > On 4/20/06 6:36 AM, "Jim Buttafuoco" <[EMAIL PROTECTED]> wrote: > > > The access is very fast when looking for a small subset of the data. > > I guess you are not using indexes because building a (non bitmap) index on > 6TB on a single machine would take days if not weeks. > > So if you are using table partitioning, do you have to refer to each child > table separately in your queries? > > - Luke > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that your >message can get through to the mailing list cleanly --- End of Original Message --- ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Quick Performance Poll
I have been following your work with great interest. I believe I spoke to someone from Greenplum at linux world in Boston a couple of weeks ago. -- Original Message --- From: "Luke Lonergan" <[EMAIL PROTECTED]> To: [EMAIL PROTECTED], "Simon Dale" <[EMAIL PROTECTED]>, pgsql-performance@postgresql.org Sent: Thu, 20 Apr 2006 08:03:10 -0700 Subject: Re: [PERFORM] Quick Performance Poll > Jim, > > On 4/20/06 7:40 AM, "Jim Buttafuoco" <[EMAIL PROTECTED]> wrote: > > > First of all this is NOT a single table and yes I am using partitioning and > > the constaint exclusion stuff. the largest > > set of tables is over 2T. I have not had to rebuild the biggest database > > yet, > > but for a smaller one ~1T the restore > > takes about 12 hours including many indexes on both large and small tables > > You would probably benefit greatly from the new on-disk bitmap index feature > in Bizgres Open Source. It's 8.1 plus the sort speed improvement and > on-disk bitmap index. > > Index creation and sizes for the binary version are in the table below (from > a performance report on bizgres network. The version in CVS tip on > pgfoundry is much faster on index creation as well. > > The current drawback to bitmap index is that it isn't very maintainable > under insert/update, although it is safe for those operations. For now, you > have to drop index, do inserts/updates, rebuild index. > > We'll have a version that is maintained for insert/update next. > > - Luke > > # Indexed Columns Create Time (seconds) Space Used (MBs) > BITMAP BTREE BITMAP BTREE > 1 L_SHIPMODE454.8 2217.1 58 1804 > 2 L_QUANTITY547.2 937.81171804 > 3 L_LINENUMBER 374.5 412.459 1285 > 4 L_SHIPMODE, L_QUANTITY948.7 2933.4 1762845 > 5 O_ORDERSTATUS 83.5241.35 321 > 6 O_ORDERPRIORITY 108.5 679.111 580 > 7 C_MKTSEGMENT 10.951.3 1 45 > 8 C_NATIONKEY 8.3 9.3 2 32 --- End of Original Message --- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Better way to write aggregates?
Jan, I write queries like this CREATE VIEW parent_childs AS SELECT c.parent, count(c.state) as childtotal, sum(case when c.state = 1 then 1 else 0 end) as childstate1, sum(case when c.state = 2 then 1 else 0 end) as childstate2, sum(case when c.state = 3 then 1 else 0 end) as childstate3 FROM child c GROUP BY parent; -- Original Message --- From: Jan Dittmer <[EMAIL PROTECTED]> To: pgsql-performance@postgresql.org Sent: Fri, 21 Apr 2006 10:37:10 +0200 Subject: [PERFORM] Better way to write aggregates? > Hi, > > I more or less often come about the problem of aggregating a > child table counting it's different states. The cleanest solution > I've come up with so far is: > > BEGIN; > CREATE TABLE parent ( > id int not null, > name text not null, > UNIQUE(id) > ); > > CREATE TABLE child ( > name text not null, > state int not null, > parent int not null references parent(id) > ); > > CREATE VIEW parent_childs AS > SELECT > c.parent, > count(c.state) as childtotal, > count(c.state) - count(nullif(c.state,1)) as childstate1, > count(c.state) - count(nullif(c.state,2)) as childstate2, > count(c.state) - count(nullif(c.state,3)) as childstate3 > FROM child c > GROUP BY parent; > > CREATE VIEW parent_view AS > SELECT p.*, > pc.* > FROM parent p > LEFT JOIN parent_childs pc ON (p.id = pc.parent); > COMMIT; > > Is this the fastest way to build these aggregates (not considering > tricks with triggers, etc)? The count(state) - count(nullif(...)) looks > a bit clumsy. > I also experimented with a pgsql function to sum these up, but considered > it as not-so-nice and it also always forces a sequential scan on the > data. > > Thanks for any advice, > > Jan > > ---(end of broadcast)--- > TIP 6: explain analyze is your friend --- End of Original Message --- ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Better way to write aggregates?
I don't think an index will help you with this query. -- Original Message --- From: Jan Dittmer <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Cc: pgsql-performance@postgresql.org Sent: Fri, 21 Apr 2006 14:35:33 +0200 Subject: Re: [PERFORM] Better way to write aggregates? > Jim Buttafuoco wrote: > > Jan, > > > > I write queries like this > > > > CREATE VIEW parent_childs AS > > SELECT > > c.parent, > > count(c.state) as childtotal, > > sum(case when c.state = 1 then 1 else 0 end) as childstate1, > > sum(case when c.state = 2 then 1 else 0 end) as childstate2, > > sum(case when c.state = 3 then 1 else 0 end) as childstate3 > > FROM child c > > GROUP BY parent; > > It would help if booleans could be casted to integer 1/0 :-) But > performance wise it should be about the same? I think I'll > run some tests later today with real data. > Would an index on NULLIF(state,1) help count(NULLIF(state,1)) ? > Can one build an index on (case when c.state = 3 then 1 else 0 end)? > > Thanks, > > Jan --- End of Original Message --- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
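For what it's worth, on reasonably recent releases a boolean can be cast to integer explicitly, which gives the 1/0 behaviour asked about above; a sketch against the quoted schema (performance should be roughly the same as the CASE form):

    CREATE VIEW parent_childs AS
    SELECT c.parent,
           count(c.state)          AS childtotal,
           sum((c.state = 1)::int) AS childstate1,
           sum((c.state = 2)::int) AS childstate2,
           sum((c.state = 3)::int) AS childstate3
    FROM child c
    GROUP BY c.parent;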
Re: [PERFORM] slow variable against int??
Please cc the list so others can help. > From: Witold Strzelczyk [mailto:[EMAIL PROTECTED] > On Friday 12 May 2006 00:04, you wrote: > > Yes, thanks but method is not a point. Actually, it is a point. Databases don't like doing things procedurally. Using a stored procedure to operate on a set of data is very often the wrong way to go about it. In the case of ranking, I'm extremely doubtful that you'll ever get a procedure to opperate anywhere near as fast as native SQL. > Can You tell me why > > select into inGameRating count(game_result)+1 > from users > where game_result > 2984; > > tooks ~100 ms and > > select into inGameRating count(game_result)+1 > from users > where game_result > inRow.game_result; > > where inRow.game_result = 2984 tooks ~1100 ms!? No, I can't. What's EXPLAIN ANALYZE show? > btw. I must try your temp sequence but if it is not as quick > as my new (and > final) function I'll send if to you. > > > If you're trying to come up with ranking then you'll be much happier > > using a sequence and pulling from it using an ordered > select. See lines > > 19-27 in http://lnk.nu/cvs.distributed.net/9bu.sql for an example. > > Depending on what you're doing you might not need the temp table. > > > > On Fri, May 05, 2006 at 04:46:43PM +0200, Witold Strzelczyk wrote: > > > I have a question about my function. I must get user > rating by game > > > result. This isn't probably a perfect solution but I have > one question > > > about > > > > > > select into inGameRating count(game_result)+1 from users > > > where game_result > inRow.game_result; > > > > > > This query in function results in about 1100 ms. > > > inRow.game_result is a integer 2984 > > > And now if I replace inRow.game_result with integer > > > > > > select into inGameRating count(game_result)+1 from users > > > where game_result > 2984; > > > > > > query results in about 100 ms > > > > > > There is probably a reason for this but can you tell me > about it because > > > I can't fine one > > > > > > My function: > > > > > > create or replace function ttt_result(int,int) returns setof > > > tparent_result language plpgsql volatile as $$ > > > declare > > > inOffset alias for $1; > > > inLimit alias for $2; > > > inRow tparent_result%rowtype; > > > inGameResult int := -1; > > > inGameRating int := -1; > > > begin > > > > > > for inRow in > > > select > > > email,wynik_gra > > > from > > > konkurs_uzytkownik > > > order by wynik_gra desc limit inLimit offset inOffset > > > loop > > > if inGameResult < 0 then -- only for first iteration > > > /* this is fast ~100 ms > > > select into inGameRating > > > count(game_result)+1 from users > > > where game_result > 2984; > > > */ > > > /* even if inRow.game_result = 2984 this is > very slow ~ 1100 ms! 
> > > select into inGameRating count(game_result)+1 > from users > > > where game_result > inRow.game_result; > > > */ > > > inGameResult := inRow.game_result; > > > end if; > > > > > > if inGameResult > inRow.game_result then > > > inGameRating := inGameRating + 1; > > > end if; > > > > > > inRow.game_rating := inGameRating; > > > inGameResult := inRow.game_result; > > > return next inRow; > > > > > > end loop; > > > return; > > > end; > > > $$; > > > -- > > > Witold Strzelczyk > > > [EMAIL PROTECTED] > > > > > > ---(end of > broadcast)--- > > > TIP 9: In versions below 8.0, the planner will ignore > your desire to > > >choose an index scan if your joining column's > datatypes do not > > >match > > -- > Witold Strzelczyk > > : : D i g i t a l O n e : : http://www.digitalone.pl > : : Dowborczykow 25 Lodz 90-019 Poland > : : tel. [+48 42] 6771477 fax [+48 42] 6771478 > > ...Where Internet works for effective business solutions... > ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
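The sequence-based ranking referred to above looks roughly like this (table and column names are assumed from the quoted function, and this variant hands out distinct ranks rather than sharing a rank between ties):

    CREATE TEMP SEQUENCE rank_seq;
    SELECT nextval('rank_seq') AS game_rating, email, game_result
    FROM (SELECT email, game_result
          FROM users
          ORDER BY game_result DESC) AS ranked;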
Re: [PERFORM] Optimizing a huge_table/tiny_table join
On May 25, 2006, at 12:07 PM, Dawid Kuroczko wrote: On 5/25/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: Well, they're not my statistics; they're explain's. You mean there's a bug in explain? I agree that it makes no sense that the costs don't differ as much as one would expect, but you can see right there the numbers of rows for the two tables. At any rate, how would one go about finding an explanation for these strange stats? Well, the query planner uses statistics to deduce the best plan possible. Explain includes this statistical data in its output. See: http://www.postgresql.org/docs/8.1/interactive/planner-stats.html ...for information about what it is all about. The idea is that your statistics are probably not detailed enough to help the planner. See ALTER TABLE SET STATISTICS to change that. http://www.pervasive-postgres.com/lp/newsletters/2006/ Insights_postgres_Mar.asp#4 might also be worth your time to read. Hmm, there is a probability (though statistics are more probable go) that you're using some older version of PostgreSQL, and you're hitting same problem as I did: http://archives.postgresql.org/pgsql-performance/2005-07/msg00345.php Tom has provided back then a patch, which fixed it: http://archives.postgresql.org/pgsql-performance/2005-07/msg00352.php ...but I don't remember when it made into release. According to cvs, it's been in since 8.1 and 8.0.4: Revision 1.111.4.2: download - view: text, markup, annotated - select for diffs Fri Jul 22 19:12:33 2005 UTC (10 months ago) by tgl Branches: REL8_0_STABLE CVS tags: REL8_0_8, REL8_0_7, REL8_0_6, REL8_0_5, REL8_0_4 Diff to: previous 1.111.4.1: preferred, colored; branchpoint 1.111: preferred, colored; next MAIN 1.112: preferred, colored Changes since revision 1.111.4.1: +18 -37 lines Fix compare_fuzzy_path_costs() to behave a bit more sanely. The original coding would ignore startup cost differences of less than 1% of the estimated total cost; which was OK for normal planning but highly not OK if a very small LIMIT was applied afterwards, so that startup cost becomes the name of the game. Instead, compare startup and total costs fuzzily but independently. This changes the plan selected for two queries in the regression tests; adjust expected-output files for resulting changes in row order. Per reports from Dawid Kuroczko and Sam Mason. Revision 1.124: download - view: text, markup, annotated - select for diffs Fri Jul 22 19:12:01 2005 UTC (10 months ago) by tgl Branches: MAIN CVS tags: REL8_1_0BETA3, REL8_1_0BETA2, REL8_1_0BETA1 Diff to: previous 1.123: preferred, colored Changes since revision 1.123: +18 -37 lines Fix compare_fuzzy_path_costs() to behave a bit more sanely. The original coding would ignore startup cost differences of less than 1% of the estimated total cost; which was OK for normal planning but highly not OK if a very small LIMIT was applied afterwards, so that startup cost becomes the name of the game. Instead, compare startup and total costs fuzzily but independently. This changes the plan selected for two queries in the regression tests; adjust expected-output files for resulting changes in row order. Per reports from Dawid Kuroczko and Sam Mason. -- Jim C. Nasby, Database Architect[EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?" ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] is it possible to make this faster?
On May 25, 2006, at 4:11 PM, Tom Lane wrote: Tom Lane <[EMAIL PROTECTED]> writes: "Merlin Moncure" <[EMAIL PROTECTED]> writes: recent versions of mysql do much better, returning same set in < 20ms. Well, since they don't do MVCC they can answer this query from the index without going to the heap at all. But that still seems remarkably fast for something that has to grovel through 300k index entries. Are you sure you measured that right? I tried to duplicate this using mysql 5.0.21, and I see runtimes of 0.45 sec without an index and 0.15 sec with. This compares to psql times around 0.175 sec. Doesn't look to me like we're hurting all that badly, even without using the index. Well, that would depend greatly on how wide the rows were, and I don't believe the OP ever mentioned that. If he's got a nice, fat varchar(1024) in that table, then it's not surprising that an index would help things. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
FW: [PERFORM] pg_xlog on data partition with BBU RAID
Forwarding to -performance From: Alan Hodgson [mailto:[EMAIL PROTECTED] On Friday 09 June 2006 12:41, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote: > Has anyone actually done any testing on this? Specifically, I'm > wondering if the benefit of adding 2 more drives to a RAID10 outweighs > whatever penalties there are to having pg_xlog on that RAID10 with all > the rest of the data. I have an external array with 1GB of write-back cache, and testing on it before deployment showed no difference under any workload I could generate between having pg_xlog on a separate RAID-1 or having it share a RAID-10 with the default tablespace. I left it on the RAID-10, and it has been fine there. We have a very write-heavy workload. -- "If a nation expects to be ignorant and free, in a state of civilization, it expects what never was and never will be." -- Thomas Jefferson ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Confirmation of bad query plan generated by 7.4
On Jun 13, 2006, at 8:50 PM, Tom Lane wrote: "Jim C. Nasby" <[EMAIL PROTECTED]> writes: On Tue, Jun 13, 2006 at 06:04:42PM -0400, Tom Lane wrote: It'd depend on the context, possibly, but it's easy to show that the current planner does fold "now() - interval_constant" when making estimates. Simple example: Turns out the difference is between feeding a date vs a timestamp into the query... I would have thought that since date is a date that the WHERE clause would be casted to a date if it was a timestamptz, but I guess not... Hmm ... worksforme. Could you provide a complete test case? decibel=# create table date_test(d date not null, i int not null); CREATE TABLE decibel=# insert into date_test select now()-x*'1 day'::interval, i from generate_series(0,3000) x, generate_series(1,10) i; INSERT 0 30010 decibel=# analyze verbose date_test; INFO: analyzing "decibel.date_test" INFO: "date_test": scanned 3 of 1622163 pages, containing 555 live rows and 0 dead rows; 3 rows in sample, 300100155 estimated total rows ANALYZE decibel=# explain select * from date_test where d >= now()-'15 days'::interval; QUERY PLAN - Seq Scan on date_test (cost=0.00..6873915.80 rows=1228164 width=8) Filter: (d >= (now() - '15 days'::interval)) (2 rows) decibel=# explain select * from date_test where d >= (now()-'15 days'::interval)::date; QUERY PLAN - Seq Scan on date_test (cost=0.00..7624166.20 rows=1306467 width=8) Filter: (d >= ((now() - '15 days'::interval))::date) (2 rows) decibel=# select version(); version - PostgreSQL 8.1.4 on amd64-portbld-freebsd6.0, compiled by GCC cc (GCC) 3.4.4 [FreeBSD] 20050518 (1 row) decibel=# -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Precomputed constants?
On Jun 15, 2006, at 1:19 PM, Zoltan Boszormenyi wrote: # select distinct provolatile from pg_proc; provolatile - i s v (3 rows) If I get this right, IMMUTABLE/STABLE/VOLATILE are indicated with their initials. That's probably correct. If the docs don't specify this then the code would. Or you could just create 3 test functions and see what you end up with, but I can't see it being any different from your guess. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
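A quick way to confirm the letters against functions whose volatility is documented:

    SELECT DISTINCT proname, provolatile
    FROM pg_proc
    WHERE proname IN ('abs', 'now', 'random')
    ORDER BY provolatile;
    -- abs -> i (immutable), now -> s (stable), random -> v (volatile)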
Re: [PERFORM] Optimizer internals
On Jun 16, 2006, at 8:43 AM, Jonah H. Harris wrote: Yes, this is certainly the most noticible case. This is one reason I'm behind the freespace patch. Unfortunately, a lot of inexperienced people use VACUUM FULL and don't understand why VACUUM is *generally* better.(to free up block-level freespace and update FSM) assuming they have enough hard disk space for the database. Another reason to turn autovac on by default in 8.2... That and of course the visibility bitmap that has been much-discussed I'd certainly like to see it. What's the hold-up on this? I thought there were some technical issues that had yet to be resolved? BTW, I'll point out that DB2 and MSSQL didn't switch to MVCC until their most recent versions. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] SAN performance mystery
On Jun 16, 2006, at 6:28 AM, Greg Stark wrote: I never understood why disk caches on the order of megabytes are exciting. Why should disk manufacturers be any better about cache management than OS authors? In the case of RAID 5 this could actually work against you since the RAID controller can _only_ use its cache to find parity blocks when writing. Software raid can use all of the OS's disk cache to that end. IIRC some of the Bizgres folks have found better performance with software raid for just that reason. The big advantage HW raid has is that you can do a battery-backed cache, something you'll never be able to duplicate in a general-purpose computer (sure, you could battery-back the DRAM if you really wanted to, but if the kernel crashed you'd be completely screwed, which isn't the case with a battery-backed RAID controller). The quality of the RAID controller also makes a huge difference. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
On Jun 16, 2006, at 12:01 PM, Josh Berkus wrote: Folks, I am thrill to inform you all that Sun has just donated a fully loaded T2000 system to the PostgreSQL community, and it's being setup by Corey Shields at OSL (osuosl.org) and should be online probably early next week. The system has So this system will be hosted by Open Source Lab in Oregon. It's going to be "donated" to Software In the Public Interest, who will own for the PostgreSQL fund. We'll want to figure out a scheduling system to schedule performance and compatibility testing on this machine; I'm not sure exactly how that will work. Suggestions welcome. As a warning, Gavin Sherry and I have a bunch of pending tests already to run. First thing as soon as I have a login, of course, is to set up a Buildfarm instance. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
On Jun 16, 2006, at 12:01 PM, Josh Berkus wrote: First thing as soon as I have a login, of course, is to set up a Buildfarm instance. Keep in mind that buildfarm clients and benchmarking stuff don't usually mix well. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Tuning New Server (slow function)
On Jun 21, 2006, at 5:53 PM, Ron St-Pierre wrote: Jim C. Nasby wrote: On Wed, Jun 21, 2006 at 02:27:41PM -0700, Ron St-Pierre wrote: We just purchased a new Dell PowerEdge 2800 (dual xeon, 8GB RAM, raid 4, RHEL, postgres 8.1) and ported our old database over to it (single cpu, RAID *4*? oops, raid 5 (but we are getting good io throughput...) Just remember that unless you have a really good battery-backed controller, writes to RAID5 pretty much suck. BEGIN TRUNCATE stock.datacount; FOR rec IN SELECT itemID, item, hexValue FROM stock.activeitem LOOP histdate := (SELECT updatedate FROM stock.historical s WHERE s.itemID=rec.itemID ORDER BY updatedate DESC LIMIT 1); IF histdate IS NOT NULL THEN funddate := (SELECT updatedate FROM stock.funddata s WHERE s.itemID=rec.itemID); techdate := (SELECT updatedate FROM stock.techsignals s WHERE s.itemID=rec.itemID); IF (histdate <> funddate) OR (histdate <> techdate) OR (funddate IS NULL) OR (techdate IS NULL) THEN counter := counter + 1; outrec.itemID := rec.itemID; outrec.item := rec.item; outrec.hexvalue := rec.hexvalue; RETURN NEXT outrec; END IF; END IF; END LOOP; INSERT INTO stock.datacount (itemcount) VALUES (counter); COPY stock.datacount TO ''/tmp/datacount''; RETURN; END; How would I rewrite it to do away with the cursor? Something like... SELECT ... FROM (SELECT a, f.updatedate AS funddate, t.updatedate AS techdate, max(updatedate) hist_date FROM activeitem a JOIN historical h USING itemid GROUP BY a, f.updatedate, t.updatedate) AS a LEFT JOIN funddate f USING itemid LEFT JOIN techsignals USING itemid WHERE f.updatedate <> hist_date OR t.updatedate <> hist_date OR f.updatedate IS NULL OR t.updatedate IS NULL ; BTW, there's some trick that would let you include the NULL tests with the other tests in the WHERE, but I can't remember it off the top of my head... "top" shows: CPU states: cpuusernice systemirq softirq iowaitidle total5.8%0.6% 31.2% 0.0% 0.0%0.5% 61.6% Mem: 8152592k av, 8143012k used,9580k free, 0k shrd, 179888k buff The high system % (if I'm reading this correctly) makes me wonder if this is some kind of locking issue. But it's the only postgres process running. Sure, but PostgreSQL still acquires internal locks. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
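The NULL-test trick alluded to above is IS DISTINCT FROM, which treats NULL as an ordinary comparable value, so each pair of tests (<> plus IS NULL) collapses into a single predicate:

    -- "<>" yields NULL when one side is NULL; IS DISTINCT FROM does not:
    SELECT 1 <> NULL               AS with_inequality,    -- NULL
           1 IS DISTINCT FROM NULL AS with_is_distinct;   -- true
    -- so, in the rewritten query above,
    --   f.updatedate IS DISTINCT FROM hist_date
    -- stands in for
    --   (f.updatedate <> hist_date OR f.updatedate IS NULL)
    -- given hist_date itself is already known to be non-null.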
[PERFORM]
I'm at a client who's an ASP; they've written their app such that each customer gets their own database. Rigth now they're at nearly 200 databases, and were thinking that they "must be the largest PostgreSQL install in the world". :) After taking them down a notch or two, I started wondering how many sites could beat 200 databases in a single cluster. I'm sure there's any number that can, though 200 databases in a cluster certainly isn't mainstream. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] [BUGS] BUG #2567: High IOWAIT
Please cc the list so others can help. How large is the database? What indexes are on the tables you're inserting into? What speed is the drive? Since it's a single SCSI drive I'm assuming it's only 10k RPM, which means the theoretical maximum you can hit is 160 transfers per second. At 40 inserts per second (I'm assuming each insert is it's own transaction), you're already at 40 WAL operations per second, minimum. Plus whatever traffic you have to the data tables. Your biggest win would be to batch those inserts together into transactions, if possible. If not, the commit_delay settings might help you out. There may be some further gains to be had by tweaking the background writer settings; it might be too aggressive in your application. That update statement could also be causing a lot of activity, depending on what it's doing. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 -Original Message- From: Kumarselvan S [mailto:[EMAIL PROTECTED] Sent: Wed 8/9/2006 11:33 PM To: Jim Nasby Subject: RE: [BUGS] BUG #2567: High IOWAIT Yes , it is not a Bug. Here the some Info abt the Hardware It has an SCSI Drive. It an dell made quad processor machine. The changes to Postgresql.conf 1. max_connections =50 2. shared buffer = 3 3. Temp buffer 2 Regards, Kumar -Original Message- From: Jim C. Nasby [mailto:[EMAIL PROTECTED] Sent: Thursday, August 10, 2006 3:57 AM To: kumarselvan Cc: pgsql-performance@postgresql.org Subject: Re: [BUGS] BUG #2567: High IOWAIT This isn't a bug; moving to pgsql-performance. On Tue, Aug 08, 2006 at 08:42:02AM +, kumarselvan wrote: > i have installed the postgres as mentioned in the Install file. it is a 4 > cpu 8 GB Ram Machine installed with Linux Enterprise version 3. when i am > running a load which will perfrom 40 inserts persecond on 2 tables and 10 > updates per 10seconds on differnt table IOWait on avg going upto 70% due to > which i am not able to increase the load. Is there is any other way to > install the postgres on multiprocessor machine.. can any one help me on > this... You haven't given us nearly enough information. What kind of hardware is this? RAID? What changes have you made to postgresql.conf? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 quad ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
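Batching just means wrapping many inserts in one transaction so they share a single WAL flush; a sketch with an entirely hypothetical table:

    BEGIN;
    INSERT INTO samples (device_id, reading, captured_at) VALUES (1, 42.0, now());
    INSERT INTO samples (device_id, reading, captured_at) VALUES (2, 17.5, now());
    -- ...a few hundred rows per batch is usually plenty...
    COMMIT;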
Re: [PERFORM] Inner Join of the same table
On Aug 15, 2006, at 1:53 PM, Sebastián Baioni wrote: 9 -> Index Scan using uesapt001 on APORTES (cost=0.00..37301678.64 rows=9339331 width=25) (actual time=110.000..2520690.000 rows=9335892 loops=1) It's taking 2520 seconds to scan an index with 9M rows, which sounds way, way too slow. I suspect that index got bloated badly at some point by not vacuuming frequently enough (autovacuum is your friend). Try reindexing and see if that fixes the problem. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
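A sketch of the rebuild, using the index name shown in the plan (REINDEX blocks writes to the table while it runs, so pick a quiet window; quote the table name if it was created as "APORTES"):

    REINDEX INDEX uesapt001;
    ANALYZE aportes;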
Re: [PERFORM] Update on high concurrency OLTP application and Postgres
Have you ever done any testing to see if just setting default_statistics_target to 500 has a negative impact on the system? On Sep 22, 2006, at 4:48 PM, Cosimo Streppone wrote: Christian Storm wrote: At the moment, my rule of thumb is to check out the ANALYZE VERBOSE messages to see if all table pages are being scanned. INFO: "mytable": scanned xxx of yyy pages, containing ... If xxx = yyy, then I keep statistics at the current level. When xxx is way less than yyy, I increase the numbers a bit and retry. It's probably primitive, but it seems to work well. > What heuristic do you use to up the statistics for such a table? No heuristics, just try and see. For tables of ~ 10k pages, I set statistics to 100/200. For ~ 100k pages, I set them to 500 or more. I don't know the exact relation. Once you've changed it, what metric do you use to > see if it helps or was effective? I rerun an analyze and see the results... :-) If you mean checking the usefulness, I can see it only under heavy load, if particular db queries run in the order of a few milliseconds. If I see normal queries that take longer and longer, or they even appear in the server's log (> 500 ms), then I know an analyze is needed, or statistics should be set higher. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
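The per-column form keeps the cost localized instead of raising the target for every column in the database; a sketch with placeholder names:

    ALTER TABLE mytable ALTER COLUMN skewed_col SET STATISTICS 500;
    ANALYZE mytable;
    -- or, globally, in postgresql.conf:
    -- default_statistics_target = 500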
Re: [PERFORM] IN not handled very well?
On Sep 24, 2006, at 2:12 PM, Ben wrote: Ah, so I do. Thanks, that helps an awful lot. But the plan is still twice as expensive as when I put in the static values. Is it just unreasonable to expect the planner to see that there aren't many rows in the subselect, so to use the bitmap scans after all? Based on your initial post, it probably should know that it's only getting 15 rows (since it did in your initial plan), so it's unclear why it's not choosing the bitmap scan. Can you post the results of EXPLAIN ANALYZE? -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Decreasing BLKSZ
On Sep 26, 2006, at 5:36 PM, Marc Morin wrote: 1- partitions loaded without indexes on them.. And build index "when partition is full". Slow to drill down into incomplete partitions. 2- paritions with index as loaded. Slow, on insert (problem mentioned) but good to drill down How big are your partitions? The number of rows in your active partition will determine how large your indexes are (and probably more importantly, how many levels there are), which will definitely affect your timing. So, you might have better luck with a smaller partition size. I'd definitely try someone else's suggestion of making the PK logtime, key (assuming that you need to enforce uniqueness) and having an extra index on just key. If you don't need to enforce uniqueness, just have one index on key and one on logtime. Or if your partitions are small enough, don't even create the logtime index until the partition isn't being inserted into anymore. If the number of key values is pretty fixed, it'd be an interesting experiment to try partitioning on that, perhaps even with one key per partition (which would allow you to drop the key from the tables entirely, ie: CREATE TABLE stats_1 (logtime PRIMARY KEY, stat1, stat2, stat3); CREATE TABLE stats_2 ... CREATE VIEW stats AS SELECT 1 AS key, * FROM stats_1 UNION ALL SELECT 2, * FROM stats_2 ... I wouldn't put too much work into that as no real effort's been expended to optimize for that case (especially the resulting monster UNION ALL), but you might get lucky. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Confusion and Questions about blocks read
On Sep 23, 2006, at 8:19 AM, Markus Schaber wrote: Btw, would it be feasible to enhance normal index scans by looking at all rows in the current table block to see whether they meet the query criteria, fetching them all, and blacklisting the block from further revisiting during the same index scan? I think that, for non-sorted cases, this could improve index scans a little, but I don't know whether it's worth the effort, given that bitmap index scans exist. The trade-off is you'd burn a lot more CPU on those pages. What might be interesting would be collapsing bitmap scan data down to a page level when certain conditions were met, such as if you're getting a significant number of hits for a given page. There are probably other criteria that could be used as well. One issue would be considering the effects of other bitmap index operations; if you're ANDing a bunch of scans together, you're likely to have far fewer tuples per page coming out the backside, which means you probably wouldn't want to burn the extra CPU to do full page scans. BTW, I remember discussion at some point about ordering the results of a bitmap scan by page/tuple ID, which would essentially do what you're talking about. I don't know if it actually happened or not, though. If this is something that interests you, I recommend taking a look at the code; it's generally not too hard to read through thanks to all the comments. -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] PostgreSQL and sql-bench
On Sep 25, 2006, at 10:58 AM, yoav x wrote: I am not comparing Postgres to MyISAM (obviously it is not a very fair comparison) and we do need ACID, so all comparison are made against InnoDB (which now supports MVCC as well). I will try again with the suggestions posted here. Make sure that you're not inadvertently disabling ACIDity in MySQL/ InnoDB; some options/performance tweaks will do that last I looked. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] selecting data from information_schema.columns
On Oct 2, 2006, at 7:31 PM, Steve Martin wrote: Regarding, newsysviews, what is the current state, I have had a quick look at the pgFoundry site and the last updates were 9 months ago. Well, the system catalogs don't change terribly often, so it's not like a lot needs to be done. We'd hoped to get them into core, but that didn't pan out. Theoretically, we should be making the views look more like information_schema, but no one's gotten to it yet. The most efficient way in the short term I can find to improve performance for our application is to create a table from information_schema.columns and update it when tables a created or deleted, or columns added or removed. E.g. Well, there's nothing that says you have to use information_schema. You can always query the catalog tables directly. Even if you don't want to use newsysviews as-is, the code there should be very helpful for doing that. There is no ability to put triggers on DDL, so the best you could do with your caching table is to just periodically update it. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
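For example, a column listing can come straight from the catalogs instead of information_schema; one way to write it:

    SELECT n.nspname AS table_schema,
           c.relname AS table_name,
           a.attname AS column_name,
           format_type(a.atttypid, a.atttypmod) AS data_type
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    JOIN pg_attribute a ON a.attrelid = c.oid
    WHERE c.relkind = 'r'
      AND a.attnum > 0
      AND NOT a.attisdropped
      AND n.nspname NOT IN ('pg_catalog', 'information_schema')
    ORDER BY n.nspname, c.relname, a.attnum;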
Re: [PERFORM] any hope for my big query?
On Oct 4, 2006, at 4:40 PM, Ben wrote: I'm surprised (though probably just because I'm ignorant) that it would have so much sequential scanning in there. For instance, because n is going to have at most a couple dozen rows, it seems that instead of scanning all of public.track, it should be able to convert my "t.length between a and b" clause to some between statements or'd together. Or at least, it would be nice if the planner could do that. That would require the planner having that knowledge at plan-time, which it can't without actually querying the database. One thing that might work wonders is performing the n query ahead of time and then sticking it in an array... that might speed things up. Worst case, you could run the n query, and then run the rest of the query for each row of n you get back. Better yet... send us a patch that allows the planner to look into what a subselect will return to us. ;) -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Unsubscribe
On Oct 4, 2006, at 10:54 AM, Joshua D. Drake wrote: [Joshua] It is ridiculous that this community expects people to read email headers to figure out how to unsubscribe from our lists. I always check the headers when I want to unsubscribe from any mailing list, and I think most people on this list have above average knowledge of such technical details. Of course, on a list with this many recipients there will always be some exceptions ... I would consider myself to have above-average knowledge of such technical details, and I didn't know the list information was in the headers until recently (the last time all of this came up). Now, I of course did know that there were headers, and I can use them to diagnose problems, but I was unaware of an RFC that explicitly stated how the headers were supposed to be sent for mailing lists. However, that is beside the point. It is still ridiculous to expect anyone to read the headers just to unsubscribe from a list. If we didn't want to add it for each list we could just add a link here: http://www.postgresql.org/community/lists/subscribe An even better option would be to switch to a list manager that actively traps these emails, such as mailman. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Unsubscribe
On Oct 4, 2006, at 11:35 AM, Csaba Nagy wrote: On Wed, 2006-10-04 at 18:02, Csaba Nagy wrote: If we didn't want to add it for each list we could just add a link here: http://www.postgresql.org/community/lists/subscribe OK, now that I've had a second look at that page, it does contain unsubscription info... but it's easy to miss at a quick glance... the caption is a big "Subscribe to Lists", and you wouldn't think at first glance that the form is actually used to unsubscribe too, would you? So maybe it's just that the text should be more explicit about what it actually does... Better yet, have an unsubscribe page... Personally, I'm tempted to get creative with procmail, and post a recipe that others can use to help enlighten those that post unsubscribe messages to the list... :> -- Jim C. Nasby, Database Architect [EMAIL PROTECTED] 512.569.9461 (cell) http://jim.nasby.net ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] slow queue-like empty table
On Oct 4, 2006, at 5:59 AM, Tobias Brox wrote: [Csaba Nagy - Thu at 10:45:35AM +0200] So you should check for "idle in transaction" sessions, those are bad... or any other long running transaction. Thank you (and others) for pointing this out, you certainly set us on the right track. We did have a few unclosed transactions; transactions not being ended by "rollback" or "commit". We've been fixing this, beating up the programmers responsible, and continuing to monitor. I don't think it's only due to those queue-like tables; we've really seen a significant improvement on the graphs showing load and CPU usage on the database server after we killed all the "idle in transaction" sessions. I can safely relax for some weeks still before I need to do more optimization work :-) Leaving transactions open for a long time is murder on pretty much any database. It's one of the worst programming mistakes you can make (from a performance standpoint). Further, mishandling transaction close is a great way to lose data:

  BEGIN;
  ...useful work
  -- COMMIT should have happened here
  ...more work
  ...ERROR!
  ROLLBACK;

You just lost that useful work. (oh, btw, we didn't really beat up the programmers ... too big geographical distances ;-) This warrants a plane ticket. Seriously. If your app programmers aren't versed in transaction management, you should probably be defining a database API that allows the use of autocommit. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
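For anyone hitting the same problem, a quick way to spot those sessions (my sketch, written against the pg_stat_activity columns in 9.2 and later; older releases such as the one in this thread show '<IDLE> in transaction' in current_query and use procpid instead):

  SELECT pid, usename, now() - xact_start AS xact_age, query
  FROM pg_stat_activity
  WHERE state = 'idle in transaction'
  ORDER BY xact_age DESC;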
Re: [PERFORM] Unsubscribe
On Oct 4, 2006, at 10:47 AM, Nolan Cafferky wrote: People ignorantly posting an unsubscribe to the list get this kind of response because it's an annoyance to the list users, Back in the day, a friend of mine set up an unsubscribe mailing list, with open membership. The idea was that if you saw someone post an unsubscribe message, you'd subscribe him to this list. He'd eventually get email from it (which had unsub directions at the end of every message) and then flail around trying to unsubscribe. It made for very entertaining reading until his ISP got tired of the complaints and made him shut it down. -- Jim C. Nasby, Database Architect [EMAIL PROTECTED] 512.569.9461 (cell) http://jim.nasby.net ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Simple join optimized badly?
On Oct 7, 2006, at 8:50 PM, Denis Lussier wrote: Wouldn't PG supporting simple optimizer hints get around this kinda problem? Seems to me that at least one customer posting per week could be solved via the use of simple hints. If the community is interested... EnterpriseDB has added support for a few different simple types of hints (optimize for speed, optimize for first rows, use particular indexes) for our upcoming 8.2 version. We are glad to submit them into the community process if there is any chance they will eventually be accepted for 8.3. +1 (and I'd be voting that way regardless of where my paycheck comes from) While it's important that we continue to improve the planner, it's simply not possible to build one that's smart enough to handle every single situation. -- Jim C. Nasby, Database Architect [EMAIL PROTECTED] 512.569.9461 (cell) http://jim.nasby.net ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Setting "nice" values
On Nov 2, 2006, at 9:14 AM, Madison Kelly wrote: I've got a script (perl, in case it matters) that I need to run once a month to prepare statements. This script queries and updates the database a *lot*. I am not concerned with the performance of the SQL calls so much as I am about the impact it has on the server's load. Is there a way to limit query speed (i.e., set a low 'nice' value on a query)? This might be an odd question, or I could be asking the question the wrong way, but hopefully you get the idea. :) The BizGres folks have been working on resource queuing, which will eventually do what you want. Take a look at the BizGres mailing list archives for more info. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
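As a stopgap (my suggestion, not something discussed in this thread), you can renice the backend process that serves the script's connection; note that this only lowers its CPU priority, it does nothing for the I/O load the queries generate:

  -- from the script's own connection, find the PID of its server process
  SELECT pg_backend_pid();
  -- then, on the database server as the postgres OS user:
  --   renice 19 -p <that pid>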
Re: [PERFORM] Help w/speeding up range queries?
On Oct 31, 2006, at 8:29 PM, Tom Lane wrote: John Major <[EMAIL PROTECTED]> writes: My problem is, I often need to execute searches of tables like these which find "All features within a range". Ie: select FeatureID from SIMPLE_TABLE where FeatureChromosomeName like 'chrX' and StartPosition > 1000500 and EndPosition < 200; A standard btree index is just going to suck for these types of queries; you need something that's actually designed for spatial range queries. You might look at the contrib/seg module --- if you can store your ranges as "seg" datatype then the seg overlap operator expresses what you need to do, and searches on an overlap operator can be handled well by a GIST index. Also, there's the PostGIS stuff, though it might be overkill for what you want. Another possibility (I think Tom has suggested in the past) is to define Start and End as a box, and then use the geometric functions built into plain PostgreSQL (though perhaps that's what he meant by "PostGIS stuff"). -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
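A minimal sketch of that geometric approach, on a reasonably recent PostgreSQL (my code, not from the thread; table and column names follow the query quoted above, the index name and the 2,000,000 upper bound are made up):

  -- treat each feature as a degenerate box along the x axis and index it with GiST
  CREATE INDEX simple_table_range_gist ON simple_table
      USING gist (box(point(startposition, 0), point(endposition, 0)));

  -- "all features entirely inside [1000500, 2000000]" becomes a box containment test
  SELECT featureid
  FROM simple_table
  WHERE featurechromosomename = 'chrX'
    AND box(point(startposition, 0), point(endposition, 0))
        <@ box(point(1000500, 0), point(2000000, 0));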
Re: [PERFORM] Context switch storm
On Nov 14, 2006, at 1:11 PM, Merlin Moncure wrote: On 11/14/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote: On Tue, Nov 14, 2006 at 09:17:08AM -0500, Merlin Moncure wrote: > On 11/14/06, Cosimo Streppone <[EMAIL PROTECTED]> wrote: > >I must say I lowered "shared_buffers" to 8192, as it was before. > >I tried raising it to 16384, but I can't seem to find a relationship > >between shared_buffers and performance level for this server. > > My findings are pretty much the same here. I don't see any link > between shared buffers and performance. I'm still looking for hard > evidence to rebut this point. Lower shared buffers leaves more > memory for what really matters, which is sorting. It depends on your workload. If you're really sort-heavy, then having memory available for that will be hard to beat. Otherwise, having a large shared_buffers setting can really help cut down on switching back and forth between the kernel and PostgreSQL. BTW, shared_buffers of 16384 is pretty low by today's standards, so that could be why you're not seeing much difference between that and 8192. Try upping it to 1/4 - 1/2 of memory and see if that changes things. Can you think of a good way to construct a test case that would demonstrate the difference? What would be the 'best case' where a high shared buffers would be favored over a low setting? Something that's read-heavy will benefit the most from a large shared_buffers setting, since it means fewer trips to the kernel. Write-heavy apps won't benefit that much because you'll end up double-buffering written data. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
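One rough way to see how read-heavy (and how cache-friendly) a cluster already is, using the standard stats views (my sketch, not from the thread):

  -- blks_hit = reads satisfied from shared_buffers;
  -- blks_read = reads passed to the kernel (which may still come from the OS cache)
  SELECT sum(blks_hit)::float8
         / nullif(sum(blks_hit) + sum(blks_read), 0) AS shared_buffer_hit_ratio
  FROM pg_stat_database;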
[PERFORM] Swapping in 7.4.3
When I went to 7.4.3 (Slackware 9.1) w/ JDBC, the improvements are that it doesn't initially take much memory (have 512M) and didn't swap. I ran a full vacuum and a cluster before installation; however, speed degraded to 1 *second* / update of one row in 150 rows of data, within a day! pg_autovacuum now gives excellent performance; however, it is taking 66M of swap; only 270k cached. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Swapping in 7.4.3
With pg_autovacuum it's now at 95M swap, averaging 5MB / day increase with the same load. Cache slightly increases or decreases according to top. --- On Tue 07/13, Matthew T. O'Connor < [EMAIL PROTECTED] > wrote: From: Matthew T. O'Connor [mailto: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Date: Tue, 13 Jul 2004 16:26:09 -0400 Subject: Re: [PERFORM] Swapping in 7.4.3 Jim Ewert wrote: > When I went to 7.4.3 (Slackware 9.1) w/ JDBC, the improvements are that it doesn't initially take much memory (have 512M) and didn't swap. I ran a full vacuum and a cluster before installation, however speed degraded to 1 *second* / update of one row in 150 rows of data, within a day! pg_autovacuum now gives excellent performance however it is taking 66M of swap; only 270k cached. > Are you saying that your system stays fast now that you are using pg_autovacuum, but pg_autovacuum is using 66M of memory? Please clarify, I'm not sure what question you want answered. Matthew ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] performance with column orders
I couldn't track down recent info in the archives, so I figured I'd ask here. Does the order of columns still have an impact on table speed? Back in the olden days, it used to be that fixed width columns (integer, tinyint, etc.) should be the first ("left") columns in the table and variable width ones should be towards the end ("right"). This allowed a database to line up the columns better on disk and give you a speed boost. So, does Postgres still care about it? And, if so, how much? The posts I found were from 2 years ago, and indicated that there is a minor increase, but not a lot. Incidentally, could anyone quantify that in any fashion? Thanks, -Jim ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Bulk Insert and Index use
If the bulk load has the possibility of duplicating data, then you need to change methods. Try bulk loading into a temp table, index it like the original, eliminate the dups and merge the tables. It is also possible to do an insert from the temp table into the final table like: insert into original (x,x,x) (select temp.1, temp.2, etc from temp left join original on temp.street=original.street where original.street is null) Good Luck Jim Rudi Starcevic wrote: Hi, I have a question on bulk checking, inserting into a table and how best to use an index for performance. The data I have to work with is a monthly CD Rom csv data dump of 300,000 property owners from one area/shire. So every CD has 300,000-odd lines, each line of data filling the 'property' table. Beginning with the first CD, each line should require one SELECT and one INSERT, as it will be the first property with this address. The SELECT uses fields like 'street' and 'suburb' to check for an existing property, so I have built an index on those fields. My question is: does each INSERT rebuild the index on the 'street' and 'suburb' fields? I believe it does, but I'm asking to be sure. If this is the case I guess performance will suffer when I have, say, 200,000 rows in the table. Would it be like: a) Use index to search on 'street' and 'suburb' b) No result? Insert new record c) Rebuild index on 'street' and 'suburb' for each row? Would this mean that after 200,000 rows each INSERT will require the index of 000's of rows to be re-indexed? So far I believe my only options are to use either an index or a sequential scan and see which is faster. A minute for your thoughts and/or suggestions would be great. Thanks. Regards, Rudi. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
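A runnable sketch of that staging-table merge (my code, not from the thread; the property/street/suburb names follow the message above, and everything else is an assumption):

  -- staging table with the same layout as the real one
  CREATE TEMP TABLE property_load (LIKE property);
  -- load the monthly CSV into property_load here, e.g. with COPY

  -- insert only rows that don't already exist; DISTINCT drops duplicates within the load itself
  INSERT INTO property (street, suburb)
  SELECT DISTINCT l.street, l.suburb
  FROM property_load l
  LEFT JOIN property p
         ON p.street = l.street
        AND p.suburb = l.suburb
  WHERE p.street IS NULL;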
Re: [PERFORM] Help trying to tune query that executes 40x slower
Hugo, I think your problem is with the MRS_TRANSACTION TRANS table. It is not joining anything when declared, but later it is joined through the LEFT JOIN of the REPL_DATA_OWNER_RSDNC table. In fact I'm not sure that this table is really needed. I would suggest rewriting your FROM clause. It appears a little busy and includes additional filters that are taken care of in the WHERE clause. What are the table layouts and what fields are indexed? Hugo Ferreira wrote: Hi there :-) I'm really, really having trouble with this query... It is a part of, hmmm... 200 similar queries that I dynamically build and run in a stored procedure. This one, for example, takes 27 seconds to run. The whole stored procedure executes in about 15 minutes. This is too much when compared to the exact same database, with the same indexes and same data, running under SqlServer 2000, which takes 21 seconds to run the whole batch. Any help would be extremely appreciated. I've also tried to tune up the configuration. insert into MRS_REPLICATION_OUT select 514, 1168, C.contxt_id, C.contxt_elmt_ix, CAST(null as NUMERIC(18)), CAST(null as NUMERIC(18)), CAST(null as NUMERIC(18)), CAST(null as NUMERIC(18)), CAST(null as NUMERIC(18)), null, 1 from c2iedm.CONTXT as P inner join c2iedm.CONTXT_ELMT as C on (P.contxt_id=C.contxt_id) inner join MRS_REPLICATION_OUT as S on S.ent_id=1029 and (CAST(P.contxt_id AS numeric(18)) = S.pk1) inner join MRS_TRANSACTION TRANS on TRANS.trans_id=514 left join NON_REPL_DATA_OWNER NRDO on NRDO.non_repl_data_owner_id=C.owner_id left join REPL_DATA_OWNER_RSDNC RDOR on RDOR.owner_id=C.owner_id and RDOR.rsdnc_node_id=TRANS.recv_node_id left join MRS_REPLICATION_OUT OUT on OUT.trans_id=514 and OUT.ent_id=1168 and ((CAST(C.contxt_id AS numeric(18)) = OUT.pk1 AND CAST(C.contxt_elmt_ix AS numeric(18)) = OUT.pk2)) inner join MRS_TRANSACTION RED_TRANS on TRANS.prov_node_id=RED_TRANS.prov_node_id and TRANS.recv_node_id=RED_TRANS.recv_node_id left join MRS_REPLICATION_OUT RED_OUT on RED_TRANS.cat_code = 'OUT' and RED_TRANS.trans_type in ('X01', 'X02') and RED_TRANS.trans_id=RED_OUT.trans_id where S.age=0 and S.trans_id=514 and (NRDO.non_repl_data_owner_id is null) AND (RDOR.repl_data_owner_id is null) AND (OUT.trans_id is null) AND (RED_OUT.trans_id is null); This kind of insert generates few rows: between 8k and 15k for this particular insert, and about 20k for the whole batch. If I try to run a batch to generate about 50k rows, then I'll be stuck here for more than 45h. Compare this to 12 minutes when running SqlServer 2000.
Here is the result of explain analyze:

Merge Left Join  (cost=1338.32..1377.99 rows=45 width=32) (actual time=719.000..26437.000 rows=14862 loops=1)
  Merge Cond: ("outer".trans_id = "inner".trans_id)
  Join Filter: (("outer".cat_code = 'OUT'::bpchar) AND (("outer".trans_type = 'X01'::bpchar) OR ("outer".trans_type = 'X02'::bpchar)))
  Filter: ("inner".trans_id IS NULL)
  ->  Sort  (cost=1067.36..1067.47 rows=45 width=56) (actual time=719.000..735.000 rows=14862 loops=1)
        Sort Key: red_trans.trans_id
        ->  Merge Join  (cost=851.66..1066.12 rows=45 width=56) (actual time=407.000..673.000 rows=14862 loops=1)
              Merge Cond: ("outer".recv_node_id = "inner".recv_node_id)
              Join Filter: ("outer".prov_node_id = "inner".prov_node_id)
              ->  Nested Loop Left Join  (cost=847.14..987.28 rows=3716 width=60) (actual time=407.000..610.000 rows=14862 loops=1)
                    Join Filter: ((("outer".contxt_id)::numeric(18,0) = "inner".pk1) AND (("outer".contxt_elmt_ix)::numeric(18,0) = "inner".pk2))
                    Filter: ("inner".trans_id IS NULL)
                    ->  Merge Left Join  (cost=718.22..746.87 rows=3716 width=60) (actual time=407.000..563.000 rows=14862 loops=1)
                          Merge Cond: (("outer".recv_node_id = "inner".rsdnc_node_id) AND ("outer".owner_id = "inner".owner_id))
                          Filter: ("inner".repl_data_owner_id IS NULL)
                          ->  Sort  (cost=717.19..726.48 rows=3716 width=74) (actual time=407.000..423.000 rows=14862 loops=1)
                                Sort Key: trans.recv_node_id, c.owner_id
                                ->  Nested Loop Left Join  (cost=1.01..496.84 rows=3716 width=74) (actual time=0.000..312.000 rows=14862 loops=1)
                                      Join Filter: ("inner".non_repl_data_owner_id = "outer".owner_id)
                                      Filter: ("inner".non_repl_data_owner_id IS NULL)
                                      ->  Nested Loop  (cost=0.00..412.22 rows=3716 width=74) (actual time=0.000..186.000 rows=14862 loops=1)
                                            ->  Seq Scan on mrs_transaction trans  (cost=0.00..2.05 rows=1 width=28) (actual time=0.000..0.000 rows=1 loops=1)
                                                  Filter: (trans_id = 514::numeric)
...
Re: [PERFORM] amazon ec2
On May 3, 2011, at 5:39 PM, Greg Smith wrote: > I've also seen over a 20:1 speedup over PostgreSQL by using Greenplum's free > Community Edition server, in situations where its column store + compression > features work well on the data set. That's easiest with an append-only > workload, and the data set needs to fit within the constraints where indexes > on compressed data are useful. But if you fit the use profile it's good at, > you end up with considerable ability to trade-off using more CPU resources to > speed up queries. It effectively increases the amount of data that can be > cached in RAM by a large multiple, and in the EC2 context (where any access > to disk is very slow) it can be quite valuable. FWIW, EnterpriseDB's "InfiniCache" provides the same caching benefit. The way that works is that when PG goes to evict a page from shared buffers, that page gets compressed and stuffed into a memcache cluster. When PG determines that a given page isn't in shared buffers, it will then check that memcache cluster before reading the page from disk. This allows you to cache amounts of data that far exceed the amount of memory you could put in a physical server. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] REINDEX takes half a day (and still not complete!)
On Apr 30, 2011, at 9:34 AM, Kenneth Marshall wrote: >> I suppose that's what I am going to do on a periodic basis from now >> on. There is a lot of DELETE/UPDATE activity. But I wonder if the >> vacuum stuff really should do something that's similar in function? >> What do the high-end enterprise folks do -- surely they can't be >> dumping/restoring every quarter or so, or are they? >> >> Anyway, many many thanks to the lovely folks on this list. Much appreciated! >> > > The autovacuum and space management in 9.0 is dramatically more effective > and efficient than that of 8.2. Unless you have an odd corner-case there > really should be no reason for a periodic dump/restore. This is not your > grandmother's Oldsmobile... :) In 10+ years of using Postgres, I've never come across a case where you actually *need* to dump and restore on a regular basis. However, you can certainly run into scenarios where vacuum simply can't keep up. If your restored database is 1/3 the size of the original, then this is certainly what was happening on your 8.2 setup. As Kenneth mentioned, 9.0 is far better in this regard than 8.2, though it's still possible that you're doing something that will give it fits. I suggest that you run a weekly vacuumdb -av, capture that output and run it through pgFouine. That will give you a ton of useful information about the amount of bloat you have in each table. I would definitely look at anything with over 20% bloat. BTW, in case you're still questioning using Postgres in an enterprise setting; all of our production OLTP databases run on Postgres. The largest one is ~1.5TB and does over 650TPS on average (with peaks that are much higher). Unplanned downtime on that database would cost us well over $100k/hour, and we're storing financial information, so data quality issues are not an option (data quality was one of the primary reasons we moved away from MySQL in 2006). So yes, you can absolutely run very large Postgres databases in a high-workload environment. BTW, that's also on version 8.3. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Shouldn't we have a way to avoid "risky" plans?
On Mar 24, 2011, at 5:23 PM, Claudio Freire wrote: > I routinely have to work around query inefficiencies because GEQO does > something odd - and since postgres gives me too few tools to tweak > plans (increase statistics, use subqueries, rephrase joins, no direct > tool before CTEs which are rather new), it becomes an art form, and it > becomes very unpredictable and an administrative burden. Out of the > blue, statistics change, queries that worked fine start to perform > poorly, and sites go down. > > If GEQO could detect unsafe plans and work around them automatically, > it would be a major improvement. This isn't limited to GEQO queries either. Every few months we'll have what should be a very fast query suddenly become far slower. Still on the order of seconds, but when you're running several of those a second and they normally take fractions of a second, this kind of performance degradation can easily bring a server to its knees. Every time this has happened, the solution has been to re-analyze a fairly large table; even with a default stats target of 1000 it's very easy for one bad analyze to ruin your day. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
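For what it's worth, the knobs involved in that workaround look like this (a sketch with hypothetical table and column names, not taken from the thread):

  -- raise the per-column statistics target, then re-gather stats
  ALTER TABLE big_table ALTER COLUMN hot_column SET STATISTICS 1000;
  ANALYZE big_table;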
Re: [PERFORM] reducing random_page_cost from 4 to 2 to force index scan
On May 16, 2011, at 10:46 AM, Tom Lane wrote: > Robert Haas writes: >> On Mon, May 16, 2011 at 12:49 AM, Jesper Krogh wrote: >>> Ok, it may not work as well with index'es, since having 1% in cache may very >>> well mean that 90% of all requested blocks are there.. for tables in should >>> be more trivial. > >> Tables can have hot spots, too. Consider a table that holds calendar >> reservations. Reservations can be inserted, updated, deleted. But >> typically, the most recent data will be what is most actively >> modified, and the older data will be relatively more (though not >> completely) static, and less frequently accessed. Such examples are >> common in many real-world applications. > > Yes. I'm not convinced that measuring the fraction of a table or index > that's in cache is really going to help us much. Historical cache hit > rates might be useful, but only to the extent that the incoming query > has a similar access pattern to those in the (recent?) past. It's not > an easy problem. > > I almost wonder if we should not try to measure this at all, but instead > let the DBA set a per-table or per-index number to use, analogous to the > override we added recently for column n-distinct statistics ... I think the challenge there would be how to define the scope of the hot-spot. Is it the last X pages? Last X serial values? Something like correlation? Hmm... it would be interesting if we had average relation access times for each stats bucket on a per-column basis; that would give the planner a better idea of how much IO overhead there would be for a given WHERE clause. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
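For reference, the existing per-column n-distinct override Tom mentions looks like this (available since 9.0; table and column names here are hypothetical), and a per-table or per-index cache-fraction setting would presumably follow the same pattern:

  ALTER TABLE some_table ALTER COLUMN some_column SET (n_distinct = 5000);
  ANALYZE some_table;   -- the override takes effect the next time the table is analyzed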
Re: [PERFORM] KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
On May 16, 2011, at 8:47 AM, Merlin Moncure wrote: > On Sat, May 14, 2011 at 5:10 AM, Stefan Keller wrote: >> Hi, >> >> I am conducting a benchmark to compare KVP table vs. hstore and got >> bad hstore performance results when the no. of records is greater than >> about 500'000. >> >> CREATE TABLE kvp ( id SERIAL PRIMARY KEY, key text NOT NULL, value text ); >> -- with index on key >> CREATE TABLE myhstore ( id SERIAL PRIMARY KEY, obj hstore NOT NULL ); >> -- with GIST index on obj >> >> Does anyone have experience with that? > > hstore is not really designed for large-ish sets like that. And KVP is? ;) IIRC hstore ends up just storing everything as text, with pointers to know where things start and end. There's no real indexing inside hstore, so basically the only thing it can do is scan the entire hstore. That said, I would strongly reconsider using KVP for anything except the most trivial of data sets. It is *extremely* inefficient. Do you really have absolutely no idea what *any* of your keys will be? Even if you need to support a certain amount of non-deterministic stuff, I would put everything you possibly can into real fields and only use KVP or hstore for things that you really didn't anticipate. Keep in mind that for every *value*, your overhead is 24 bytes for the heap header, 2+ varlena bytes in the heap, plus the length of the key. In the index you're looking at 6+ bytes of overhead, 1+ byte for varlena, plus the length of the key. The PK will cost you an additional 16-24 bytes, depending on alignment. So that's a *minimum* of ~50 bytes per value, and realistically the overhead will be closer to 65-70 bytes, *per value*. Unless your values are decent-sized strings, the overhead is going to be many times larger than the actual data! -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
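As an aside (my suggestion, not something benchmarked in this thread), hstore also ships a GIN operator class, which often scales better than GiST as row counts grow; a minimal sketch using the table definition quoted above:

  CREATE INDEX myhstore_obj_gin ON myhstore USING gin (obj);
  -- containment lookup: rows whose hstore includes key 'color' with value 'red'
  SELECT id FROM myhstore WHERE obj @> 'color=>red'::hstore;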
Re: [PERFORM] reducing random_page_cost from 4 to 2 to force index scan
On May 19, 2011, at 9:53 AM, Robert Haas wrote: > On Wed, May 18, 2011 at 11:00 PM, Greg Smith wrote: >> Jim Nasby wrote: >>> I think the challenge there would be how to define the scope of the >>> hot-spot. Is it the last X pages? Last X serial values? Something like >>> correlation? >>> >>> Hmm... it would be interesting if we had average relation access times for >>> each stats bucket on a per-column basis; that would give the planner a >>> better idea of how much IO overhead there would be for a given WHERE clause >> >> You've already given one reasonable first answer to your question here. If >> you defined a usage counter for each histogram bucket, and incremented that >> each time something from it was touched, that could lead to a very rough way >> to determine access distribution. Compute a ratio of the counts in those >> buckets, then have an estimate of the total cached percentage; multiplying >> the two will give you an idea how much of that specific bucket might be in >> memory. It's not perfect, and you need to incorporate some sort of aging >> method to it (probably weighted average based), but the basic idea could >> work. > > Maybe I'm missing something here, but it seems like that would be > nightmarishly slow. Every time you read a tuple, you'd have to look > at every column of the tuple and determine which histogram bucket it > was in (or, presumably, which MCV it is, since those aren't included > in working out the histogram buckets). That seems like it would slow > down a sequential scan by at least 10x. You definitely couldn't do it in real time. But you might be able to copy the tuple somewhere and have a background process do the analysis. That said, it might be more productive to know what blocks are available in memory and use correlation to guesstimate whether a particular query will need hot or cold blocks. Or perhaps we create a different structure that lets you track the distribution of each column linearly through the table; something more sophisticated than just using correlation, perhaps something like indicating which stats bucket was most prevalent in each block/range of blocks in a table. That information would allow you to estimate exactly what blocks in the table you're likely to need... -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] seq scan in the case of max() on the primary key column
On Jun 16, 2011, at 1:36 PM, Shaun Thomas wrote:
> /**
>  * Return the Maximum INT Value for a Partitioned Table Column
>  *
>  * @param string Name of Schema of the base partition table.
>  * @param string Name of the base partition table.
>  * @param string Name of column to search.
>  */
> CREATE OR REPLACE FUNCTION spc_max_part_int(VARCHAR, VARCHAR, VARCHAR)
> RETURNS INT AS
> $$
> DECLARE
>
> SELECT INTO nParent t.oid
>   FROM pg_class t
>   JOIN pg_namespace n ON (t.relnamespace=n.oid)
>  WHERE n.nspname = sSchema
>    AND t.relname = sTable;

FWIW, instead of that, I would do this:

CREATE FUNCTION ...( p_parent_schema text, p_parent_table text ) ...
DECLARE
    c_parent_oid CONSTANT oid := (p_parent_schema || '.' || p_parent_table)::regclass;
...

or ...

CREATE FUNCTION( p_parent text )
DECLARE
    c_parent_oid CONSTANT oid := p_parent::regclass;

Advantages:
- ::regclass is search_path-aware, so you're not forced into providing a schema if you don't want to
- it will throw an error if it doesn't find a regclass entry
- you can cast the oid back to text: EXECUTE 'SELECT max(' ... 'FROM ' || c_parent_oid::regclass
- you can also query directly with the OID: SELECT relkind = 'r' AS is_table FROM pg_class WHERE oid = c_parent_oid

-- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
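Pulling those pieces together, here is a minimal runnable sketch (my code, not from the thread; it assumes old-style inheritance partitioning and takes the max per child so each partition's index can be used; names and the bigint return type are assumptions):

  CREATE OR REPLACE FUNCTION spc_max_part_int(p_parent text, p_column text)
  RETURNS bigint AS $$
  DECLARE
      c_parent_oid CONSTANT oid := p_parent::regclass;  -- search_path-aware; errors if the table doesn't exist
      v_child  oid;
      v_max    bigint;
      v_result bigint;
  BEGIN
      -- rows stored in the parent itself (often empty in a partitioned setup)
      EXECUTE 'SELECT max(' || quote_ident(p_column) || ') FROM ONLY ' || c_parent_oid::regclass
         INTO v_result;

      -- visit each child partition; a per-partition max() can use that partition's index
      FOR v_child IN
          SELECT inhrelid FROM pg_inherits WHERE inhparent = c_parent_oid
      LOOP
          EXECUTE 'SELECT max(' || quote_ident(p_column) || ') FROM ' || v_child::regclass
             INTO v_max;
          IF v_max IS NOT NULL AND (v_result IS NULL OR v_max > v_result) THEN
              v_result := v_max;
          END IF;
      END LOOP;

      RETURN v_result;
  END;
  $$ LANGUAGE plpgsql;

  -- example call (hypothetical table/column):
  -- SELECT spc_max_part_int('public.my_partitioned_table', 'id');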
Re: [PERFORM] Infinite Cache
On Jul 1, 2011, at 9:43 AM, Anthony Presley wrote: > Was curious if there was some sort of Open Source version of Infinite Cache, > and/or a memcache layer that can be "dropped" in front of PostgreSQL without > application changes (which seems to be the "key" piece of Infinite Cache), or > is this something that EnterpriseDB owns and you have to buy their version of > the software to use? There had been some talk at one point about getting the backend-changes to support Infinite Cache into mainline Postgres. If that ever happened you could build your own version of it. BTW, thanks to the compression feature of IC I've heard it can actually be beneficial to run it on the same server. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index not being used when variable is sent
On Aug 17, 2011, at 1:49 AM, Eyal Wilde wrote: > 1. is there any more elegant solution? Very possibly, but I'm having a heck of a time trying to figure out what your current code is actually doing. What's the actual problem you're trying to solve here? -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Need to tune for Heavy Write
On Aug 4, 2011, at 10:07 AM, Scott Marlowe wrote: > On Thu, Aug 4, 2011 at 7:57 AM, Kevin Grittner > wrote: >>> RAM : 16 GB >> >>> effective_cache_size = 4096MB >> >> That should probably be more like 12GB to 15GB. It probably won't >> affect the load time here, but could affect other queries. > > Actually on a heavily written database a large effective cache size > makes things slower. effective_cache_size or shared_buffers? I can see why a large shared_buffers could cause problems, but what effect does effective_cache_size have on a write workload? -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
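Worth noting (my summary, not from the thread): effective_cache_size is only a planner input used when costing index scans; it allocates no memory, so on its own it shouldn't slow down writes. A hedged starting point for a 16GB machine like the one above:

  # postgresql.conf (illustrative values, assuming ~16GB of RAM mostly dedicated to Postgres)
  shared_buffers = 4GB            # real shared memory allocation
  effective_cache_size = 12GB     # planner estimate of shared_buffers + OS cache; nothing is allocated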