Re: [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
Arjen van der Meijden wrote:
> I can already confirm very good scalability (with our workload) on
> postgresql on that machine. We've been testing a 32-thread/16GB version
> and it shows near-linear scaling when enabling 1, 2, 4, 6 and 8 cores
> (with all four threads enabled). The threads are a bit less scalable,
> but still pretty good: enabling 2 or 4 threads per core (versus 1)
> yields respectively 60% and 130% extra performance.

Wow, what type of workload is it? And did you do much tuning to get
near-linear scalability to 32 threads?

Regards,
-Robert
Re: [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> > Interesting. We (some Japanese companies including SRA OSS,
> > Inc. Japan) did some PG scalability testing using a Unisys big 16
> > (physical) CPU machine and found PG scales up to 8 CPUs. However,
> > beyond 8 CPUs PG does not scale anymore. The result can be viewed at
> > the "OSS iPedia" web site (http://ossipedia.ipa.go.jp). Our
> > conclusion, from analyzing the oprofile results, was that PG has a
> > serious lock contention problem in that environment.
>
> 18% in s_lock is definitely bad :-(. Were you able to determine which
> LWLock(s) are accounting for the contention?

Yes, we were interested in that too. Some people did additional tests to
determine that. I don't have the report handy now; I will report back
next week.

> The test case seems to be spending a remarkable amount of time in LIKE
> comparisons, too. That probably is not a representative condition.

I know. I think the point is that 18% in s_lock only appears with 12
CPUs or more.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> Interesting. We (some Japanese companies including SRA OSS,
> Inc. Japan) did some PG scalability testing using a Unisys big 16
> (physical) CPU machine and found PG scales up to 8 CPUs. However,
> beyond 8 CPUs PG does not scale anymore. The result can be viewed at
> the "OSS iPedia" web site (http://ossipedia.ipa.go.jp). Our conclusion,
> from analyzing the oprofile results, was that PG has a serious lock
> contention problem in that environment.

18% in s_lock is definitely bad :-(. Were you able to determine which
LWLock(s) are accounting for the contention?

The test case seems to be spending a remarkable amount of time in LIKE
comparisons, too. That probably is not a representative condition.

regards, tom lane
Re: [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
> I am thrilled to inform you all that Sun has just donated a fully
> loaded T2000 system to the PostgreSQL community, and it's being set up
> by Corey Shields at OSL (osuosl.org) and should be online probably
> early next week. The system has:
>
> * 8 cores, 4 hw threads/core @ 1.2 GHz. Solaris sees the system as
>   having 32 virtual CPUs, and each can be enabled or disabled
>   individually
> * 32 GB of DDR2 SDRAM memory
> * 2 @ 73GB internal SAS drives (10K RPM)
> * 4 Gigabit ethernet ports
>
> For the complete spec, visit
> http://www.sun.com/servers/coolthreads/t2000/specifications.jsp
>
> I think this system is well suited for PG scalability testing, among
> other things. We did an informal test using an internal OLTP benchmark
> and noticed that PG can scale to around 8 CPUs. It would be really cool
> if all 32 virtual CPUs could be utilized!!!

Interesting. We (some Japanese companies including SRA OSS, Inc. Japan)
did some PG scalability testing using a Unisys big 16 (physical) CPU
machine and found PG scales up to 8 CPUs. However, beyond 8 CPUs PG does
not scale anymore. The result can be viewed at the "OSS iPedia" web site
(http://ossipedia.ipa.go.jp). Our conclusion, from analyzing the oprofile
results, was that PG has a serious lock contention problem in that
environment.

You can take a look at the detailed report at:
http://ossipedia.ipa.go.jp/capacity/EV0604210111/

(Unfortunately only Japanese content is available at the moment. Please
use an automatic translation service.)

The evaluation environment was:

PostgreSQL 8.1.2
OSDL DBT-1 2.1
Miracle Linux 4.0
Unisys ES7000: Xeon 2.8GHz CPU x 16, 16GB RAM (HT off)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [HACKERS] [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
Arjen,

> I can already confirm very good scalability (with our workload) on
> postgresql on that machine. We've been testing a 32-thread/16GB version
> and it shows near-linear scaling when enabling 1, 2, 4, 6 and 8 cores
> (with all four threads enabled).

Keen. We're trying to keep the linear scaling going up to 32 cores, of
course (which doesn't happen, presently). Would you be interested in
helping us troubleshoot some of the performance issues?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
On 16-6-2006 17:18, Robert Lor wrote:
> I think this system is well suited for PG scalability testing, among
> other things. We did an informal test using an internal OLTP benchmark
> and noticed that PG can scale to around 8 CPUs. It would be really cool
> if all 32 virtual CPUs could be utilized!!!

I can already confirm very good scalability (with our workload) on
postgresql on that machine. We've been testing a 32-thread/16GB version
and it shows near-linear scaling when enabling 1, 2, 4, 6 and 8 cores
(with all four threads enabled). The threads are a bit less scalable,
but still pretty good: enabling 2 or 4 threads per core (versus 1)
yields respectively 60% and 130% extra performance.

Best regards,

Arjen
Re: [PERFORM] Question about clustering multiple columns
On Fri, Jun 16, 2006 at 11:11:59 -0700, Benjamin Arai <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Thanks for the reply. I have one more question. Does it matter in which
> order that I make the index?

Please keep replies copied to the lists so that other people can learn
from and contribute to the discussion. In this case I am just going to
copy back to the performance list, since it is generally better for
performance questions than the general list.

> For example, should I create an index on (cusip, date) or (date,
> cusip); does it matter which order? My goal is to cluster the entries
> by cusip, then for each cusip order the data by date (maybe the
> ordering by date occurs automatically). Hmm, in that case maybe I only
> need to cluster by cusip, but then how do I ensure that each cusip has
> its data ordered by date?

I think that you want to order by cusip (assuming that corresponds to
"name" in your sample query below) first. You won't end up having to go
through values in the index that will be filtered out if you do it that
way.

The documentation for the CLUSTER command says that it clusters on
indexes, not columns. So if the index is on (cusip, date), then the
records will be ordered by cusip, date immediately after the cluster.
(New records added after the cluster are not guaranteed to be ordered by
the index.)

> -----Original Message-----
> From: Bruno Wolff III [mailto:[EMAIL PROTECTED]
> Sent: Friday, June 16, 2006 8:32 AM
> To: Benjamin Arai
> Cc: pgsql-general@postgresql.org; pgsql-performance@postgresql.org
> Subject: Re: Question about clustering multiple columns
>
> On Tue, Jun 13, 2006 at 09:04:15 -0700, Benjamin Arai <[EMAIL PROTECTED]> wrote:
> > I have a database where there are three columns (name, date, data).
> > The queries are almost always something like SELECT date, data FROM
> > table WHERE name=blah AND date > 1/1/2005 AND date < 1/1/2006;. I
> > currently have three B-tree indexes, one for each of the columns. Is
> > clustering on the date index going to be what I want, or do I need an
> > index that contains both name and date?
>
> I would expect that clustering on the name would be better for the
> above query. You probably want an index on name and date combined.
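For instance, with made-up table and index names (and using the
PostgreSQL 8.1 CLUSTER syntax, which takes the index name first), the
combined index plus the cluster operation would look something like:

    -- composite index: the equality column (cusip) first, date second
    CREATE INDEX prices_cusip_date_idx ON prices (cusip, date);

    -- physically reorder the table by that index (8.1 syntax)
    CLUSTER prices_cusip_date_idx ON prices;

After this, rows for each cusip are stored together, ordered by date
within each cusip, until enough new rows accumulate that re-clustering
is warranted.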
Re: [PERFORM] SAN performance mystery
On Jun 16, 2006, at 5:11 AM, Tim Allen wrote:
> One curious thing is that some postgres backends seem to spend an
> inordinate amount of time in uninterruptible iowait state. I found a
> posting to this list from December 2004 from someone who reported that
> very same thing. For example, bringing down postgres on the customer
> box requires kill -9, because there are invariably one or two processes
> so deeply uninterruptible as to not respond to a politer signal. That
> indicates something not quite right, doesn't it?

Sounds like there could be a driver/array/kernel bug there that is
kicking the performance down the tube. If it were PG's fault, the
processes wouldn't be stuck uninterruptible.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.dellsmartexitin.com/
http://www.stuarthamm.net/
Re: [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
Folks,

> I am thrilled to inform you all that Sun has just donated a fully
> loaded T2000 system to the PostgreSQL community, and it's being set up
> by Corey Shields at OSL (osuosl.org) and should be online probably
> early next week.

So this system will be hosted by the Open Source Lab in Oregon. It's
going to be "donated" to Software In the Public Interest, who will own
it for the PostgreSQL fund.

We'll want to figure out a scheduling system to schedule performance and
compatibility testing on this machine; I'm not sure exactly how that
will work. Suggestions welcome. As a warning, Gavin Sherry and I already
have a bunch of pending tests to run.

The first thing as soon as I have a login, of course, is to set up a
Buildfarm instance.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] Question about clustering multiple columns
On Tue, Jun 13, 2006 at 09:04:15 -0700, Benjamin Arai <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a database where there are three columns (name, date, data). The
> queries are almost always something like SELECT date, data FROM table
> WHERE name=blah AND date > 1/1/2005 AND date < 1/1/2006;. I currently
> have three B-tree indexes, one for each of the columns. Is clustering
> on the date index going to be what I want, or do I need an index that
> contains both name and date?

I would expect that clustering on the name would be better for the above
query. You probably want an index on name and date combined.
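Something along these lines (untested sketch; "mytable" stands in for the
question's table, columns as in the question):

    CREATE INDEX mytable_name_date_idx ON mytable (name, date);

    -- the composite index matches the query's access pattern:
    SELECT date, data
    FROM mytable
    WHERE name = 'blah'
      AND date > '2005-01-01'
      AND date < '2006-01-01';

With the equality column (name) first, the planner can descend straight
to the matching name and then range-scan the date portion of the index.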
[PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL community
I am thrilled to inform you all that Sun has just donated a fully loaded
T2000 system to the PostgreSQL community, and it's being set up by Corey
Shields at OSL (osuosl.org) and should be online probably early next
week. The system has:

* 8 cores, 4 hw threads/core @ 1.2 GHz. Solaris sees the system as
  having 32 virtual CPUs, and each can be enabled or disabled
  individually
* 32 GB of DDR2 SDRAM memory
* 2 @ 73GB internal SAS drives (10K RPM)
* 4 Gigabit ethernet ports

For the complete spec, visit
http://www.sun.com/servers/coolthreads/t2000/specifications.jsp

I think this system is well suited for PG scalability testing, among
other things. We did an informal test using an internal OLTP benchmark
and noticed that PG can scale to around 8 CPUs. It would be really cool
if all 32 virtual CPUs could be utilized!!!

Anyways, if you need to access the system for testing purposes, please
contact Josh Berkus.

Regards,
Robert Lor
Sun Microsystems, Inc.
01-510-574-7189
Re: [PERFORM] SAN performance mystery
On 6/16/06, Mikael Carneholm <[EMAIL PROTECTED]> wrote:
> We've seen similar results with our EMC CX200 (fully equipped) when
> compared to a single (1) SCSI disk machine. For sequential reads/writes
> (import, export, updates on 5-10 30M+ row tables), performance is
> downright awful. A big DB update took 5-6h in pre-prod (single SCSI),
> and 10-14(?)h (don't recall the exact details) in production (EMC SAN).
> And this was with a proprietary DB, btw - no fsync on/off affecting the
> results here.

You are in good company. We bought a Hitachi AMS200 with 2Gb FC and a
gigabyte of cache. We were shocked and dismayed to find the unit could
do about 50 MB/s measured from dd (yes, around the performance of a
single consumer-grade SATA drive). It is my (unconfirmed) belief that
the unit was governed internally to encourage you to buy the more
expensive version, the AMS500, etc. Needless to say, we sent the unit
back, and are now waiting on a Xyratex 4Gb FC-attached SAS unit. We
spoke directly to their performance people, who told us to expect the
unit to be network-bandwidth bottlenecked, as you would expect. They
were even talking about a special mode where you could bond the dual FC
ports; now that's power. If the unit really does what they claim, I will
be back here talking about it for sure ;)

The bottom line is that most SANs, even from some of the biggest
vendors, are simply worthless from a performance angle. You have to be
really critical when you buy them, don't believe anything the sales rep
tells you, and make sure to negotiate in advance a return policy in case
the unit does not perform. There is tons of b.s. out there, but so far
my impression of Xyratex is really favorable (fingers crossed), and I'm
hearing lots of great stuff about them from the channel.

merlin
Re: [PERFORM] Optimizer internals
On 16 Jun 2006 09:21:01 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:
> Well Oracle has to do almost all that same work, it's just doing it in
> a separate place called a rollback segment.

Well, it's not really the same work. The process by which Oracle manages
UNDO is actually pretty simple and efficient, but complex in its
implementation. There have also been some significant performance
improvements in this area in both 9i and 10g.

> There are pros and cons especially where it comes to indexes, but also
> where it comes to what happens when the new record is larger than the
> old one.

Certainly, you want to avoid row chaining at all costs, which is why
PCTFREE is there. I have researched update-in-place for PostgreSQL and
can avoid row chaining... so I think we can get the same benefit without
the management and administration cost.

> IMHO the biggest problem Postgres has is when you're updating a lot of
> records in a table with little free space.

Yes, this is certainly the most noticeable case. This is one reason I'm
behind the freespace patch. Unfortunately, a lot of inexperienced people
use VACUUM FULL and don't understand why plain VACUUM (to free up
block-level free space and update the FSM) is *generally* better,
assuming they have enough hard disk space for the database.

> That and of course the visibility bitmap that has been much-discussed

I'd certainly like to see it.

> I wouldn't say the benchmarks are flawed but I also don't think you
> can point to any specific design feature and say it's essential just
> on the basis of bottom-line results. You have to look at the actual
> benefit of the specific wins.

True.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 2nd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/
Re: [PERFORM] Delete operation VERY slow...
David Leangen <[EMAIL PROTECTED]> writes:
> The only inconvenience is that I need to remove all the foreign key
> constraints before truncating, then put them back after.

I was about to ask if you had any. Usually the reason for DELETE being
slow is that you have foreign key references to (not from) the table and
the referencing columns aren't indexed. This forces a seqscan search of
the referencing table for each row deleted :-(

regards, tom lane
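To see which tables reference a given table, a catalog query along these
lines works (the table name here is a placeholder):

    SELECT conname,
           conrelid::regclass  AS referencing_table,
           confrelid::regclass AS referenced_table
    FROM pg_constraint
    WHERE contype = 'f'                     -- foreign key constraints
      AND confrelid = 'mytable'::regclass;  -- table being deleted from

Each referencing column that turns up here should normally have an index
if you delete from (or update keys in) the referenced table.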
Re: [PERFORM] Optimizer internals
"Jonah H. Harris" <[EMAIL PROTECTED]> writes: > Now, if we're considering UPDATES (the worst case for PostgreSQL's > current MVCC architecture), then this is (IMHO) a true statement. > There aren't many *successful* commercial databases that incur the > additional overhead of creating another version of the record, marking > the old one as having been updated, inserting N-number of new index > entries to point to said record, and having to WAL-log all > aforementioned changes. Well Oracle has to do almost all that same work, it's just doing it in a separate place called a rollback segment. There are pros and cons especially where it comes to indexes, but also where it comes to what happens when the new record is larger than the old one. > I've done a good amount of research on enhancing PostgreSQL's MVCC in UPDATE > conditions and believe there is a nice happy medium for us. IMHO the biggest problem Postgres has is when you're updating a lot of records in a table with little free space. Postgres has to keep jumping back and forth between the old records it's reading in and the new records it's writing out. That can in theory turn a simple linear update scan into a O(n^2) operation. In practice read-ahead and caching should help but I'm not clear to what extent. That and of course the visibility bitmap that has been much-discussed that might make vacuum not have to visit every page and allow index scans to skip checking visibility info for some pages would be major wins. > /me waits for the obligatory and predictable, "the benchmarks are > flawed" response. I wouldnt' say the benchmarks are flawed but I also don't think you can point to any specific design feature and say it's essential just on the basis of bottom-line results. You have to look at the actual benefit the specific wins. Oracle and the others all implement tons of features intended to optimize applications like the benchmarks (and the benchmarks specifically of course:) that have huge effects on the results. Partitioned tables, materialized views, etc allow algorithmic improvements that do much more than any low level optimizations can do. -- greg ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Optimizer internals
On 16 Jun 2006 07:23:26 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:
> The flip side is that Oracle and others like it have to do a lot of
> extra footwork if you query data that hasn't been committed yet. That
> footwork has performance implications.

Not disagreeing here at all, but considering that Oracle, DB2, SQL
Server, et al. have proven themselves to perform extremely well under
heavy load (in multiple benchmarks), the overhead of an UNDO
implementation has a calculable break-even point. Feel free to debate
it, but the optimistic approach adopted by nearly every commercial
database vendor is *generally* a better approach for OLTP.

Consider Weikum & Vossen (p. 442):

  We also need to consider the extra work that the recovery algorithm
  incurs during normal operation. This is exactly the catch with the
  class of no-undo/no-redo algorithms. By and large, they come at the
  expense of a substantial overhead during normal operations that may
  increase the execution cost per transaction by a factor of two or even
  higher. In other words, it reduces the achievable transaction
  throughput of a given server configuration by a factor of two or more.

Now, if we're considering UPDATEs (the worst case for PostgreSQL's
current MVCC architecture), then this is (IMHO) a true statement. There
aren't many *successful* commercial databases that incur the additional
overhead of creating another version of the record, marking the old one
as having been updated, inserting N-number of new index entries to point
to said record, and having to WAL-log all the aforementioned changes. I
have yet to see any successful commercial RDBMS using some sort of
no-undo algorithm that escapes the "factor of two or more" performance
reduction.

However, if you consider an INSERT or DELETE in PostgreSQL, those are
implemented much better than in most commercial database systems, thanks
to PostgreSQL's MVCC design.

I've done a good amount of research on enhancing PostgreSQL's MVCC in
UPDATE conditions and believe there is a nice happy medium for us.

/me waits for the obligatory and predictable, "the benchmarks are
flawed" response.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 2nd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/
Re: [PERFORM] how to partition disks
On Wed, Jun 14, 2006 at 04:32:23PM +0200, Sven Geisler wrote:
> For example, you run two queries with two clients and each query needs
> to read some indices from disk. In this case it is more efficient to
> read from different volumes than to read from one large volume where
> the disk arms have to jump.

Hmm. Bad example, IMO. In the case of reading indices you're doing
random IO and the heads will be jumping all over the place anyway. The
limiting factor there will be seeks/s, and you'll possibly get better
results with the larger array. (That case is fairly complicated to
analyze and depends very much on the data.)

Where multiple arrays will be faster is if you have a lot of sequential
IO--in fact, a couple of cheap disks can blow away a fairly expensive
array for purely sequential operations, since each disk can handle
>60MB/s if it doesn't have to seek, whereas multiple sequential streams
on the big array will cause each disk in the array to seek. (The array
controller will try to hide this with its cache; its cache size &
software will determine how successful it is at doing so.)

Mike Stone
Re: [PERFORM] SAN performance mystery
We've seen similar results with our EMC CX200 (fully equipped) when
compared to a single (1) SCSI disk machine. For sequential reads/writes
(import, export, updates on 5-10 30M+ row tables), performance is
downright awful. A big DB update took 5-6h in pre-prod (single SCSI),
and 10-14(?)h (don't recall the exact details) in production (EMC SAN).
And this was with a proprietary DB, btw - no fsync on/off affecting the
results here.

FC isn't exactly known for great bandwidth; iirc a 2Gbit FC channel tops
out at about 192MB/s. So, especially if you mostly have DW/BI type
workloads, go for DAD (Direct Attached Disks) instead.

/Mikael

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Allen
Sent: 15 June 2006 23:50
To: pgsql-performance@postgresql.org
Subject: [PERFORM] SAN performance mystery

We have a customer who are having performance problems. They have a
large (36G+) postgres 8.1.3 database installed on an 8-way opteron with
8G RAM, attached to an EMC SAN via fibre-channel (I don't have details
of the EMC SAN model, or the type of fibre-channel card at the moment).
They're running RedHat ES3 (which means a 2.4.something Linux kernel).

They are unhappy about their query performance. We've been doing various
things to try to work out what we can do. One thing that has been
apparent is that autovacuum has not been able to keep the database
sufficiently tamed. A pg_dump/pg_restore cycle reduced the total
database size from 81G to 36G. Performing the restore took about 23
hours.

We tried restoring the pg_dump output to one of our machines, a
dual-core Pentium D with a single SATA disk, no raid, I forget how much
RAM but definitely much less than 8G. The restore took five hours. So it
would seem that our machine, which on paper should be far less
impressive than the customer's box, does more than four times the I/O
performance.

To simplify greatly - single local SATA disk beats EMC SAN by a factor
of four. Is that expected performance, anyone? It doesn't sound right to
me. Does anyone have any clues about what might be going on? Buggy
kernel drivers? Buggy kernel, come to think of it? Does a SAN just not
provide adequate performance for a large database?

I'd be grateful for any clues anyone can offer,

Tim
Re: [PERFORM] SAN performance mystery
"Alex Turner" <[EMAIL PROTECTED]> writes: > Given the fact that most SATA drives have only an 8MB cache, and your RAID > controller should have at least 64MB, I would argue that the system with the > RAID controller should always be faster. If it's not, you're getting > short-changed somewhere, which is typical on linux, because the drivers just > aren't there for a great many controllers that are out there. Alternatively Linux is using the 1-4 gigabytes of cache available to it effectively enough that the 64 megabytes of mostly duplicated cache just isn't especially helpful... I never understood why disk caches on the order of megabytes are exciting. Why should disk manufacturers be any better about cache management than OS authors? In the case of RAID 5 this could actually work against you since the RAID controller can _only_ use its cache to find parity blocks when writing. Software raid can use all of the OS's disk cache to that end. -- greg ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Optimizer internals
Mark Lewis <[EMAIL PROTECTED]> writes:
> On Thu, 2006-06-15 at 14:05 -0400, John Vincent wrote:
> > Now I've been told by our DBA that we should have been able to wholly
> > satisfy that query via the indexes.
>
> DB2 can satisfy the query using only indexes because DB2 doesn't do
> MVCC.

Well, it's more subtle than that. DB2 most certainly does provide MVCC
semantics, as do Oracle and MSSQL and any other serious SQL
implementation. But there are different ways to implement MVCC, and
every database makes decisions that have pros and cons. Postgres's
implementation has some big benefits over others (no rollback segments,
no expensive recovery operations, fast inserts and updates), but it also
has disadvantages (periodic vacuums, and indexes don't cover the data).

The distinction you're looking for here is sometimes called "optimistic"
versus "pessimistic" space management. (Not locking, that's something
else.) Postgres is "pessimistic": it treats every transaction as if it
might be rolled back. Oracle and most others are "optimistic": they
assume every transaction will be committed, and store information
elsewhere to implement MVCC and recover in case it's rolled back.

The flip side is that Oracle and others like it have to do a lot of
extra footwork if you query data that hasn't been committed yet. That
footwork has performance implications.

--
greg
Re: [PERFORM] Delete operation VERY slow...
> Wow! That was almost instantaneous. I can't believe the difference.
>
> The only inconvenience is that I need to remove all the foreign key
> constraints before truncating, then put them back after. But I suppose
> it is a small price to pay for this incredible optimization.

In that case, your DELETE might have been slowed down by foreign key
checks. Suppose you have tables A and B, and table A has a column "b_id
REFERENCES B(id)". When you delete from B, postgres has to look up in A
which rows reference the deleted rows, in order to perform the ON DELETE
action you specified in the constraint. If you do not have an index on
b_id, this can be quite slow... so you should check that the foreign key
relations which need indexes actually have them.
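In concrete terms (a minimal sketch of the situation described above,
with made-up names):

    CREATE TABLE b (
        id serial PRIMARY KEY
    );

    CREATE TABLE a (
        id   serial PRIMARY KEY,
        b_id integer REFERENCES b(id) ON DELETE CASCADE
    );

    -- PostgreSQL indexes b.id automatically (it's a primary key), but
    -- it does NOT automatically index the referencing column a.b_id.
    -- Without this index, every row deleted from b triggers a seqscan
    -- of a:
    CREATE INDEX a_b_id_idx ON a (b_id);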
Re: [PERFORM] SAN performance mystery
Tim Allen wrote:
> We have a customer who are having performance problems. They have a
> large (36G+) postgres 8.1.3 database installed on an 8-way opteron
> with 8G RAM, attached to an EMC SAN via fibre-channel (I don't have
> details of the EMC SAN model, or the type of fibre-channel card at the
> moment). They're running RedHat ES3 (which means a 2.4.something Linux
> kernel).
>
> To simplify greatly - single local SATA disk beats EMC SAN by a factor
> of four. Is that expected performance, anyone? It doesn't sound right
> to me. Does anyone have any clues about what might be going on? Buggy
> kernel drivers? Buggy kernel, come to think of it? Does a SAN just not
> provide adequate performance for a large database?
>
> I'd be grateful for any clues anyone can offer,
>
> Tim

Thanks to all who have replied so far. I've learned a few new things in
the meantime.

Firstly, the fibre-channel card is an Emulex LP1050. The customer seems
to have rather old drivers for it, so I have recommended that they
upgrade asap. I've also suggested they might like to upgrade their
kernel to something recent too (e.g. upgrade to RHEL4), but there's no
telling whether they'll accept that recommendation.

The fact that SATA drives are wont to lie about write completion, which
several posters have pointed out, presumably has an effect on write
performance (i.e. apparent write performance is increased at the cost of
an increased risk of data loss), but, again presumably, not much of an
effect on read performance. After loading the customer's database on our
fairly modest box with the single SATA disk, we also tested select query
performance, and while we didn't see a factor-of-four gain, we certainly
saw that read performance is also substantially better. So the fsync
issue possibly accounts for part of our factor of four, but not all of
it. That is, the SAN is still not doing well by comparison, even
allowing for the presumption that the SATA disk is less honest.

One curious thing is that some postgres backends seem to spend an
inordinate amount of time in uninterruptible iowait state. I found a
posting to this list from December 2004 from someone who reported that
very same thing. For example, bringing down postgres on the customer box
requires kill -9, because there are invariably one or two processes so
deeply uninterruptible as to not respond to a politer signal. That
indicates something not quite right, doesn't it?

Tim

--
---
Tim Allen          [EMAIL PROTECTED]
Proximity Pty Ltd  http://www.proximity.com.au/
Re: [PERFORM] Delete operation VERY slow...
Wow! That was almost instantaneous. I can't believe the difference.

The only inconvenience is that I need to remove all the foreign key
constraints before truncating, then put them back after. But I suppose
it is a small price to pay for this incredible optimization.

Thank you!

On Fri, 2006-06-16 at 12:52 +0530, Gourish Singbal wrote:
> David,
>
> TRUNCATE TABLE would be a good idea if you want to delete all the data
> in the table. You need not perform a vacuum in this case since no dead
> rows are created.
>
> ~gourish
>
> On 6/16/06, David Leangen <[EMAIL PROTECTED]> wrote:
> > Hello!
> >
> > I am trying to delete an entire table. There are about 41,000 rows
> > in the table (based on count(*)).
> >
> > I am using the SQL command: delete from table;
> >
> > The operation seems to take on the order of hours, rather than
> > seconds or minutes.
> >
> > "Explain delete from table" gives me:
> >
> >                         QUERY PLAN
> > -------------------------------------------------------------
> >  Seq Scan on table  (cost=0.00..3967.74 rows=115374 width=6)
> > (1 row)
> >
> > I am using an Intel Pentium D 2.8GHz CPU. My system has about 1.2GB
> > of RAM. This should be ok... my database isn't that big, I think.
> >
> > Any ideas why this takes so long and how I could speed this up?
> >
> > Or alternatively, is there a better way to delete all the contents
> > from a table?
> >
> > Thank you!
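The constraint juggling described here would look something like the
following (hypothetical table and constraint names; note that re-adding
the constraint rescans the referencing table to validate it, so it isn't
free):

    -- drop the FK so TRUNCATE is allowed on the referenced table
    ALTER TABLE a DROP CONSTRAINT a_b_id_fkey;

    TRUNCATE TABLE b;

    -- put the constraint back; existing rows in a are re-checked here
    ALTER TABLE a ADD CONSTRAINT a_b_id_fkey
        FOREIGN KEY (b_id) REFERENCES b(id);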
Re: [PERFORM] SAN performance mystery
Tim Allen wrote:
> We have a customer who are having performance problems. They have a
> large (36G+) postgres 8.1.3 database installed on an 8-way opteron
> with 8G RAM, attached to an EMC SAN via fibre-channel (I don't have
> details of the EMC SAN model, or the type of fibre-channel card at the
> moment). They're running RedHat ES3 (which means a 2.4.something Linux
> kernel).
>
> They are unhappy about their query performance. We've been doing
> various things to try to work out what we can do. One thing that has
> been apparent is that autovacuum has not been able to keep the
> database sufficiently tamed. A pg_dump/pg_restore cycle reduced the
> total database size from 81G to 36G. Performing the restore took about
> 23 hours.

Hi Tim!

To give you some comparison - we have a similarly sized database here
(~38GB after a fresh restore and ~76GB after some months in production).
The server is a 4-core Opteron @ 2.4GHz with 16GB RAM, connected via two
QLogic 2Gbit HBAs to the SAN (IBM DS4300 Turbo). It took us quite a
while to get this combination up to speed, but a full dump & restore
cycle (via a pg_dump | psql pipe over the net) now takes only about an
hour.

23 hours, or even 5 hours, sounds really excessive - I'm wondering about
some basic issues with the SAN. If you are using any kind of
multipathing (most likely the one in the QLA drivers), I would first
suspect that you are playing ping-pong between the controllers (i.e. the
FC cards send I/O to more than one SAN head, causing the heads to fail
over constantly, which completely destroys performance).

ES3 is rather old too, and I don't think that even their hacked-up
kernel is very good at driving a large Opteron SMP box (2.6 should be
MUCH better in that regard).

Other than that - how well is your postgresql instance tuned to your
hardware?

Stefan
Re: [PERFORM] Delete operation VERY slow...
David,

TRUNCATE TABLE would be a good idea if you want to delete all the data
in the table. You need not perform a vacuum in this case since no dead
rows are created.

~gourish

On 6/16/06, David Leangen <[EMAIL PROTECTED]> wrote:
> Hello!
>
> I am trying to delete an entire table. There are about 41,000 rows in
> the table (based on count(*)).
>
> I am using the SQL command: delete from table;
>
> The operation seems to take on the order of hours, rather than seconds
> or minutes.
>
> "Explain delete from table" gives me:
>
>                         QUERY PLAN
> -------------------------------------------------------------
>  Seq Scan on table  (cost=0.00..3967.74 rows=115374 width=6)
> (1 row)
>
> I am using an Intel Pentium D 2.8GHz CPU. My system has about 1.2GB of
> RAM. This should be ok... my database isn't that big, I think.
>
> Any ideas why this takes so long and how I could speed this up?
>
> Or alternatively, is there a better way to delete all the contents
> from a table?
>
> Thank you!

--
Best,
Gourish Singbal
Re: [PERFORM] Delete operation VERY slow...
On 16.06.2006 at 15:58:46 +0900, David Leangen wrote the following:
>
> Hello!
>
> I am trying to delete an entire table. There are about 41,000 rows in
> the table (based on count(*)).
>
> I am using the SQL command: delete from table;

Use TRUNCATE TABLE instead.

Andreas
--
Andreas Kretschmer (contact: see header)
Heynitz: 035242/47215, D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
===Schollglas Unternehmensgruppe===
[PERFORM] Delete operation VERY slow...
Hello!

I am trying to delete an entire table. There are about 41,000 rows in
the table (based on count(*)).

I am using the SQL command: delete from table;

The operation seems to take on the order of hours, rather than seconds
or minutes.

"Explain delete from table" gives me:

                        QUERY PLAN
-------------------------------------------------------------
 Seq Scan on table  (cost=0.00..3967.74 rows=115374 width=6)
(1 row)

I am using an Intel Pentium D 2.8GHz CPU. My system has about 1.2GB of
RAM. This should be ok... my database isn't that big, I think.

Any ideas why this takes so long and how I could speed this up?

Or alternatively, is there a better way to delete all the contents from
a table?

Thank you!