Re: [PERFORM] Postgresql and Software RAID/LVM
John A Meinel wrote:

> Isn't this actually more of a problem for the meta-data to give out in a
> hardware situation? I mean, if the card you are using dies, you can't just
> get another one. With software raid, because the meta-data is on the
> drives, you can pull it out of that machine, and put it into any machine
> that has a controller which can read the drives, and a similar kernel, and
> you are back up and running.

Probably true. If you have a similar kernel and hardware, and if you can recover the state information, knowing where the state information is stored. Those are some very big "ifs" during a hectic disaster.

> No, it hedges against *more* than one failure. But you can also do a RAID1
> over a RAID5 in software. But if you are honestly willing to create a full
> RAID1, just create a RAID1 over RAID0. The performance is much better. And
> since you have a full RAID1, as long as both drives of a pairing don't
> give out, you can lose half of your drives.

True as well. The problem with RAID1 over RAID0 is that, during a drive failure, you are one bad sector away from disaster. Further, RAID5 does automatic rebuild, whereas most RAID1 setups do not. RAID5 reduces the amount of time that things are degraded, reducing the time that your data is in danger.

> If you want the space, but you feel that RAID5 isn't redundant enough, go
> to RAID6, which uses 2 parity locations, each with a different method of
> storing parity, so not only is it more redundant, you have a better chance
> of finding problems.

Agreed, RAID6 is the future, but it still won't keep the server running when the RAID controller dies, or the SCSI/FC host adapter goes, or you want to upgrade controller firmware, or you want to replace the media, or...

> So you are saying that you were able to replace the RAID controller
> without turning off the machine? I realize there does exist hot-swappable
> PCI cards, but I think you are overstating what you mean by "fully
> operational".
> For instance, it's not like you can access your data while it is being
> physically moved.

Detach mirror 1, uncable and move, recable and resync. Detach mirror 2, uncable and move, recable and resync.

> I do think you had some nice hardware. But I know you can do all of this
> in software as well. It is usually a price/performance tradeoff. You spend
> quite a bit to get a hardware RAID card that can keep up with a modern
> CPU. I know we have an FC raid box at work which has a full 512MB of cache
> on it, but it wasn't that much cheaper than buying a dedicated server.

We run two Nexsan ATABoy2 arrays. These can be found in 1 TB configurations for about $3,000 each, putting mirrored RAID5 storage at $6 per GB. Is that a lot of money for storage? Maybe. In our case, that's dirt cheap protection against storage-related downtime.

Marty

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Postgresql and Software RAID/LVM
> Has anyone ran Postgres with software RAID or LVM on a production box?
> What have been your experience?

Yes, we have run Pg for a couple of years with software LVM (mirroring) against two hardware RAID5 arrays. We host a production Sun box that runs 24/7.

My experience:

* Software RAID (other than mirroring) is a disaster waiting to happen. If the metadata for the RAID set gives out for any reason (CMOS scrambles, card dies, power spike, etc.) then you are hosed beyond belief. In most cases it is almost impossible to recover. With mirroring, however, you can always boot and operate on a single mirror, pretending that no LVM/RAID is underway. In other words, each mirror is a fully functional copy of the data which will operate your server.

* Hardware RAID5 is a terrific way to boost performance via write caching and spreading I/O across multiple spindles. Each of our external arrays operates 14 drives (12 data, 1 parity and 1 hot spare). While RAID5 protects against single spindle failure, it will not hedge against multiple failures in a short time period, SCSI controller failure, SCSI cable problems or even wholesale failure of the RAID controller. All of these things happen in a 24/7 operation. Using software RAID1 against the hardware RAID5 arrays hedges against any single failure.

* Software mirroring gives you tremendous ability to change the system while it is running, by taking offline the mirror you wish to change and then synchronizing it after the change. On a fully operational production server, we have:

  * restriped the RAID5 array
  * replaced all RAID5 media with higher capacity drives
  * upgraded the RAID5 controller
  * moved all data from an old RAID5 array to a newer one
  * replaced the host SCSI controller
  * uncabled and physically moved storage to a different part of the data center

Again, all of this has taken place (over the years) while our machine was fully operational.
Re: [PERFORM] PostgreSQL clustering VS MySQL clustering
> Tell me if I am wrong but it sounds to me like
> an endless problem

Agreed. Such it is with caching.

After doing some informal benchmarking with 8.0 under Solaris, I am convinced that our major choke point is WAL synchronization, at least for applications with a high commit rate. We have noticed a substantial improvement in performance with 8.0 vs 7.4.6. All of the update/insert problems seem to have gone away, save WAL syncing. I may have to take back what I said about indexes.

Olivier Sirven wrote:

Le Vendredi 21 Janvier 2005 19:18, Marty Scholes a écrit :

> The indexes can be put on a RAM disk tablespace and that's the end of
> index problems -- just make sure you have enough memory available. Also
> make sure that the machine can restart correctly after a crash: the
> tablespace is dropped and recreated, along with the indexes. This will
> cause a machine restart to take some time.

Tell me if I am wrong but it sounds to me like an endless problem. This solution may work with small indexes (less than 4GB) but what happens when the indexes grow? You would add more memory to your server? But there will be a moment where you can not add more, so what's next?
Re: [PERFORM] PostgreSQL clustering VS MySQL clustering
Tatsuo,

I agree completely that vacuum falls apart on huge tables. We could probably do the math and figure out what the ratio of updated rows to total rows is each day, but on a constantly growing table, that ratio gets smaller and smaller, making the impact of dead tuples in the table proportionately less and less.

If multi-version indexes are handled the same way as table rows, then the indexes will also suffer the same fate, if not worse. For huge tables, the b-tree depth can get fairly large. When a b-tree is of depth X and the machine holds the first Y levels of the b-tree in memory, then each table row selected requires a MINIMUM of (X-Y) disk accesses *before* the table row is accessed. Substitute any numbers you want for X and Y, but you will find that huge tables require many index reads.

Index updates are even worse. A table row update requires only a copy of the row. An index update requires at least a copy of the leaf node, and possibly more nodes if nodes must be split or collapsed. These splits and collapses can cascade, causing many nodes to be affected. This whole process takes place for each and every index affected by the change, which is every index on the table when a row is added or deleted. All of this monkeying around takes place above and beyond the simple change of the row data. Further, each and every affected index page is dumped to WAL.

Assuming the indexes have the same MVCC properties as row data, the indexes will accumulate dead tuples at a rate far higher than that of the table data.

So yes, vacuuming is a problem on large tables. It is a bigger problem for indexes. On large tables, index I/O comprises most of the I/O mix. Don't take my word for it. Run a benchmark on Pg. Then soft-link the index files and the WAL directories to a RAM disk. Rerun the benchmark and you will find that Pg is far faster -- much faster than if only the data were on the RAM disk.

Marty

Tatsuo Ishii wrote:

> IMO the bottle neck is not WAL but table/index bloat.
> Lots of updates on large tables will produce lots of dead tuples. Problem
> is, there is no effective way to reuse these dead tuples since VACUUM on
> huge tables takes a long time. 8.0 adds new vacuum delay parameters.
> Unfortunately this does not help. It just makes the execution time of
> VACUUM longer, which means more and more dead tuples are being made while
> updating. Probably VACUUM works well for small to medium size tables, but
> not for huge ones. I'm considering implementing "on the spot salvaging of
> dead tuples".
>
> -- Tatsuo Ishii

This is probably a lot easier than you would think. You say that your DB will have lots of data, lots of updates and lots of reads.

Very likely the disk bottleneck is mostly index reads and writes, with some critical WAL fsync() calls. In the grand scheme of things, the actual data is likely not accessed very often.

The indexes can be put on a RAM disk tablespace and that's the end of index problems -- just make sure you have enough memory available. Also make sure that the machine can restart correctly after a crash: the tablespace is dropped and recreated, along with the indexes. This will cause a machine restart to take some time.

After that, if the WAL fsync() calls are becoming a problem, put the WAL files on a fast RAID array, either a card or an external enclosure, that has a good amount of battery-backed write cache. This way, the WAL fsync() calls will flush quickly to the RAM and Pg can move on while the RAID controller worries about putting the data to disk. With WAL, low access time is usually more important than total throughput.

The truth is that you could have this running for not much money.

Good Luck,
Marty

Le Jeudi 20 Janvier 2005 19:09, Bruno Almeida do Lago a écrit :
> Could you explain to us what you have in mind for that solution? I mean,
> forget the PostgreSQL (or any other database) restrictions and explain
> how this hardware would be. Where would the data be stored?
> I've something in mind for you, but first I need to understand your needs!

I just want to make a big database, as explained in my first mail... At the beginning we will have approx. 150,000,000 records... Each month we will add about 4-8 million new rows in constant flow during the day, and at the same time web users will access the database in order to read those data. Stored data are quite close to data stored by Google (we are not making a Google clone, just a lot of data: many small values and some big ones; that's why I'm comparing with Google for data storage). Then we will have a search engine searching into those data...

As for the hardware, for the moment we have only a dual Xeon 2.8 GHz with 4 GB of RAM, and we saw that we had bad performance results, so we are thinking about a new solution with maybe several servers (server design may vary from one to another) to get a kind of cluster and better performance...
Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
Randolf,

You probably won't want to hear this, but this decision likely has nothing to do with brands, models, performance or applications. You are up against a pro salesman who is likely very good at what he does. Instead of spewing all sorts of "facts" and statistics to your client, the salesman is probably trying to figure out what is driving your client.

Do you know what is driving your client? Why does he want to switch? Why now? Why not next quarter? Why not last quarter? Why does he want to do the application at all?

Forget the expected answers, e.g., "We need this application to enhance our competitiveness in the marketplace and increase the blah blah blah." Why does YOUR CLIENT actually care about any of this? Is he trying to impress his boss? Build his career? Demonstrate that he can manage a significant project? Is he trying to get rid of old code from an ex-coworker that he hated? Is it spite? Pride? Is he angling for a bigger budget next year? Is there someone who will be assigned to this project that your client wants to lord over?

The list goes on and on, and there is no way that your client is going to admit the truth and say something like, "The real reason I want to do this right now is that my childhood rival at XYZ Corp just did a project like this. I need to boost my ego, so I *MUST* do a bigger project, right now."

You gotta read between the lines. How important is this and why? How urgent and why? Who all is behind this project? What are each individual's personal motivations? Does anyone resent a leader on the team and secretly wish for this project to fail?

Once you know what is actually going on in people's heads, you can begin to build rapport and influence them. You can establish your own credibility and safety with your solution, while planting seeds of doubt about another solution.

At its core, this decision is (very likely) not at all about RDBMS performance or anything else related to computing.
Have you asked yourself why you care about one solution over another? What's driving you to push Pg over MS? Why? You might want to start answering those questions before you even talk to your client.

Good Luck,
Marty

Randolf Richardson wrote:

I'm looking for recent performance statistics on PostgreSQL vs. Oracle vs. Microsoft SQL Server. Recently someone has been trying to convince my client to switch from SyBASE to Microsoft SQL Server (they originally wanted to go with Oracle but have since fallen in love with Microsoft). All this time I've been recommending PostgreSQL for cost and stability (my own testing has shown it to be better at handling abnormal shutdowns and using fewer system resources) in addition to true cross-platform compatibility.

If I can show my client some statistics that PostgreSQL outperforms these (I'm more concerned about it beating Oracle because I know that Microsoft's stuff is always slower, but I need the information anyway to protect my client from falling victim to a 'sales job'), then PostgreSQL will be the solution of choice, as the client has always believed that they need a high-performance solution.

I've already convinced them on the usual price, cross-platform compatibility, open source, long history, etc. points, and I've been assured that if the performance is the same or better than Oracle's and Microsoft's solutions, then PostgreSQL is what they'll choose.

Thanks in advance.
Re: [PERFORM] PostgreSQL clustering VS MySQL clustering
This is probably a lot easier than you would think. You say that your DB will have lots of data, lots of updates and lots of reads.

Very likely the disk bottleneck is mostly index reads and writes, with some critical WAL fsync() calls. In the grand scheme of things, the actual data is likely not accessed very often.

The indexes can be put on a RAM disk tablespace and that's the end of index problems -- just make sure you have enough memory available. Also make sure that the machine can restart correctly after a crash: the tablespace is dropped and recreated, along with the indexes. This will cause a machine restart to take some time.

After that, if the WAL fsync() calls are becoming a problem, put the WAL files on a fast RAID array, either a card or an external enclosure, that has a good amount of battery-backed write cache. This way, the WAL fsync() calls will flush quickly to the RAM and Pg can move on while the RAID controller worries about putting the data to disk. With WAL, low access time is usually more important than total throughput.

The truth is that you could have this running for not much money.

Good Luck,
Marty

Le Jeudi 20 Janvier 2005 19:09, Bruno Almeida do Lago a écrit :
> Could you explain to us what you have in mind for that solution? I mean,
> forget the PostgreSQL (or any other database) restrictions and explain
> how this hardware would be. Where would the data be stored?
>
> I've something in mind for you, but first I need to understand your needs!

I just want to make a big database, as explained in my first mail... At the beginning we will have approx. 150,000,000 records... Each month we will add about 4-8 million new rows in constant flow during the day, and at the same time web users will access the database in order to read those data. Stored data are quite close to data stored by Google (we are not making a Google clone, just a lot of data: many small values and some big ones ...
that's why I'm comparing with Google for data storage). Then we will have a search engine searching into those data...

As for the hardware, for the moment we have only a dual Xeon 2.8 GHz with 4 GB of RAM, and we saw that we had bad performance results, so we are thinking about a new solution with maybe several servers (server design may vary from one to another) to get a kind of cluster and better performance...

Am I clear?

Regards,
[PERFORM] Disk performance, was Re: tablespaces and DB administration
This was a lively debate on what was faster, single spindles or RAID. This is important, because I keep running into people who do not understand the performance dynamics of an RDBMS like Oracle or Pg.

Pg and Oracle make a zillion tiny reads and writes and fsync() regularly. If your drive will copy a 4GB file at a sustained rate of 72 MB/s, that tells you nothing about how it will do with an RDBMS.

I will throw in my experience on RAID vs. spindles. With the RAID write cache disabled, a well balanced set of spindles will kill a RAID system any day. Enable the cache, and the RAID starts inching ahead.

My experience is that no one can continuously keep I/O properly balanced across several spindles on a production system. Things change and the I/O mix changes. Then the RAID is outperforming the spindles. If you want to spend the rest of your career constantly balancing I/O across spindles, then do so. For the rest of us, with a write cache, a hardware RAID wins hands down over the long haul.

It might make sense to provide some sort of benchmarking tool for various systems so that we can predict I/O performance. Run the following code both on a hardware RAID and on a single spindle:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <time.h>

void makefile(int fs)
{
    int     i;
    char    buf[8192];
    int     ld;
    int     blocks = 4096;
    int     pos;
    time_t  stim;
    time_t  etim;
    float   avg;

    memset(buf, 0, sizeof(buf));
    unlink("dump.out");
    ld = open("dump.out", O_WRONLY | O_CREAT, 0600);

    printf("Writing %d blocks sequentially\n", blocks);
    time(&stim);
    for (i = 0; i < blocks; i++) {
        write(ld, buf, sizeof(buf));
        if (fs)
            fsync(ld);
    }
    time(&etim);
    avg = (float) blocks / (etim > stim ? etim - stim : 1);
    printf("Took %d seconds, avg %f iops\n", (int) (etim - stim), avg);

    /* purge the write cache */
    fsync(ld);

    printf("Writing %d blocks (somewhat randomly)\n", blocks);
    time(&stim);
    for (i = 0; i < blocks; i++) {
        pos = (rand() % blocks) * sizeof(buf);
        lseek(ld, pos, SEEK_SET);
        write(ld, buf, sizeof(buf));
        if (fs)
            fsync(ld);
    }
    time(&etim);
    avg = (float) blocks / (etim > stim ? etim - stim : 1);
    printf("Took %d seconds, avg %f iops\n", (int) (etim - stim), avg);

    close(ld);
    unlink("dump.out");
}

int main()
{
    printf("No fsync()\n");
    makefile(0);
    printf("With fsync()\n");
    makefile(1);
    return 0;
}

The first operation shows how well the OS write cache is doing. The second shows how poorly everything runs with fsync(), which is what Pg and Oracle do.
My RAID produced the following, but was also running production when I ran it:

No fsync()
Writing 4096 blocks sequentially
Took 1 seconds, avg 4096.00 iops
Writing 4096 blocks (somewhat randomly)
Took 4 seconds, avg 1024.00 iops
With fsync()
Writing 4096 blocks sequentially
Took 40 seconds, avg 102.42 iops
Writing 4096 blocks (somewhat randomly)
Took 66 seconds, avg 62.060608 iops

When I ran this on a decent fibre channel drive, I got:

No fsync()
Writing 4096 blocks sequentially
Took 1 seconds, avg 4096.00 iops
Writing 4096 blocks (somewhat randomly)
Took 7 seconds, avg 585.142883 iops
With fsync()
Writing 4096 blocks sequentially
Took 106 seconds, avg 38.641510 iops
Writing 4096 blocks (somewhat randomly)
Took 115 seconds, avg 35.617390 iops

You can see that the RAID array really helps out with small writes.
Re: [PERFORM] PostgreSQL caching
Vitaly,

This looks like there might be some room for performance improvement...

> MS> I didn't see the table structure, but I assume
> MS> that the vote_avg and
> MS> vote_count fields are in bv_bookgenres.
>
> I didn't understand you. vote_avg is stored in bv_books.

Ok. That helps. The confusion (on my end) came from the SELECT clause of the query you provided:

> SELECT bv_books. * ,
>        vote_avg,
>        vote_count

All fields from bv_books were selected (bv_books.*) along with vote_avg and vote_count. My assumption was that vote_avg and vote_count were therefore not in bv_books.

At any rate, a query with an IN clause should help quite a bit:

SELECT bv_books.*
FROM bv_books
WHERE bv_books.book_id IN (
    SELECT book_id
    FROM bv_bookgenres
    WHERE genre_id = 5830
)
ORDER BY vote_avg DESC
LIMIT 10 OFFSET 0;

Give it a whirl.

Marty
Re: [PERFORM] PostgreSQL caching
> Hello Marty,
>
> MS> Is that a composite index?
>
> It is a regular btree index. What is a composite index?

My apologies. A composite index is one that consists of multiple fields (aka a multicolumn index). The reason I ask is that it was spending almost half the time just searching bv_bookgenres, which seemed odd.

I may be speaking out of turn since I am not overly familiar with Pg's quirks and internals. A composite index, or any index on a large field, will lower the number of index items stored per btree node, thereby lowering the branching factor and increasing the tree depth. On tables with many rows, this can result in many more disk accesses for reading the index. An index btree that is 6 levels deep will require at least seven disk accesses (6 for the index, one for the table row) per row retrieved.

Not knowing the structure of the indexes, it's hard to say too much about it. The fact that a 1993 row select from an indexed table took 3.5 seconds caused me to take notice.

> MS> I would be curious to see how it performs with an "IN" clause,
> MS> which I would suspect would go quite a bit faster.
>
> Actually it reached 20s before I canceled it... Here's the explain:

I believe that. The code I posted had a nasty join bug. If my math is right, the query was trying to return 1993*1993, or just under 4 million rows.

I didn't see the table structure, but I assume that the vote_avg and vote_count fields are in bv_bookgenres. If no fields are actually needed from bv_bookgenres, then the query might be constructed in a way that only the index would be read, without loading any row data.

I think that you mentioned this was for a web app. Do you actually have a web page that displays 2000 rows of data?

Good luck,
Marty
Re: [PERFORM] PostgreSQL caching
Not knowing a whole lot about the internals of Pg, one thing jumped out at me: each trip to get data from bv_books took 2.137 ms, which came to over 4.2 seconds right there. The problem "seems" to be the 1993 times that the nested loop spins, as almost all of the time is spent there.

Personally, I am amazed that it takes 3.585 seconds to index scan i_bookgenres_genre_id. Is that a composite index? Analyzing the tables may help, as the optimizer appears to mispredict the number of rows returned.

I would be curious to see how it performs with an "IN" clause, which I would suspect would go quite a bit faster. Try the following:

SELECT bv_books.*, vote_avg, vote_count
FROM bv_bookgenres, bv_books
WHERE bv_books.book_id IN (
    SELECT book_id
    FROM bv_genres
    WHERE bv_bookgenres.genre_id = 5830
)
AND bv_bookgenres.genre_id = 5830
ORDER BY vote_avg DESC
LIMIT 10 OFFSET 0;

In this query, all of the book_id values are pulled at once. Who knows? If you get statistics on this, please post.

Marty
Re: [PERFORM] Hardware Platform
Duane wrote:
> P.S. I've only just begun using PostgreSQL after having
> used (and still using) DB2 on a mainframe for the past 14
> years. My experience with Unix/Linux is limited to some
> community college classes I've taken but we do have
> a couple of experienced Linux sysadmins on our team.
> I tell you this because my "ignorance" will probably
> show more than once in my inquiries.

Duane,

If you've been actively using and developing in DB2, presumably under MVS or whatever big blue is calling it these days, for 14 years, then you will bring a wealth of big system expertise to Pg. Please stay involved and make suggestions where you think Pg could be improved.

Marty
Re: [PERFORM] Quad processor options
After reading the replies to this, it is clear that this is a Lintel-centric question, but I will throw in my experience.

> I am curious if there are any real life production
> quad processor setups running postgresql out there.

Yes. We are running a 24/7 operation on a quad CPU Sun V880.

> Since postgresql lacks a proper replication/cluster
> solution, we have to buy a bigger machine.

This was a compelling reason for us to stick with SPARC and avoid Intel/AMD when picking a DB server. We moved off of an IBM mainframe in 1993 to Sun gear and never looked back. We can upgrade to our heart's content with minimal disruption, and we are only on our third box in 11 years, with plenty of life left in our current one.

> Right now we are running on a dual 2.4 Xeon, 3 GB Ram
> and U160 SCSI hardware-raid 10.

A couple of people mentioned hardware RAID, which I completely agree with. I prefer an external box with a SCSI or FC connector. There are no driver issues that way. We boot from our arrays.

The Nexsan ATABoy2 is a nice blend of performance, reliability and cost. Some of these with 1TB and 2TB of space were recently spotted on eBay for under $5k. We run a VERY random I/O mix on ours and it will consistently sustain 15 MB/s of blended read and write I/O, sustaining well over 1200 I/Os per second. These are IDE drives, so they fail more often than SCSI; run RAID1 or RAID5. The cache on these pretty much eliminates the RAID5 penalties.

> The 30k+ setups from Dell etc. don't fit our budget.

For that kind of money you could get a lower end Sun box (or IBM RS/6000, I would imagine) and give yourself an astounding amount of headroom for future growth.

Sincerely,
Marty
Re: [PERFORM] Scaling further up
I have some suggestions based on my anecdotal experience.

1. This is a relatively small DB -- the working set will likely be in RAM at any moment in time, making read I/O time mostly irrelevant.

2. The killer will be write times -- specifically log writes. Small and heavily synchronized writes, log and data writes, will drag down an impressive hardware RAID setup. We run mirrored hardware RAID5 arrays with write-back cache and are constantly seeking ways to improve write performance. We do a lot of batch processing, though, so we do a lot of write I/Os.

3. Be very careful with "battery backed write cache." It usually works as advertised. More than once in the past decade I have seen spontaneous cache corruption after power loss. The corruption usually happens when some admin, including me, has assumed that the cache will ALWAYS survive a power failure unblemished and has no "plan B." Make sure you have a contingency plan for corruption, or don't enable the cache.

4. RAID 10 will likely have bigger stripe sizes on the RAID 0 portion of the setup, and might hinder, not help, small write I/O performance.

5. Most (almost all) of the I/O time will be due to the access time (head seek + head settle + rotational latency) and very little of the I/O time will be due to data transfer time. In other words, getting drives that provide faster transfer rates will barely improve performance. The secret is lowering the access time.

6. A relatively cheap way to drastically drop the access time is to get large drive(s) and only use a portion of them for storage. The less space used on the drive, the less area the heads need to cover for seeks. At one extreme, you could make the partition the size of a single cylinder. This would make access time (ignoring OS and controller overhead) identical to rotational latency, which is as low as 4.2 ms for a cheap 7200 RPM drive.

7.
A drive with a 5 ms average service time, servicing 8 KB blocks, will yield as much as 1.6 MB/s of sustained write throughput. Not bad for a cheap, uncached solution. Any OS aggregation of writes during the fsync() call will further improve this number -- it is basically a lower bound for throughput.

8. Many people, especially managers, cannot stomach buying disk space and only using a portion of it. In many cases, it seems more palatable to purchase a much more expensive solution to get to the same speeds.

Good luck.

scott.marlowe wrote:

On Wed, 3 Mar 2004, Paul Thomas wrote:
> On 02/03/2004 23:25 johnn wrote:
> > [snip]
> > random_page_cost should be set with the following things taken into
> > account:
> >   - seek speed
>
> Which is not exactly the same thing as spindle speed as it's a combination
> of spindle speed and track-to-track speed. I think you'll find that a 15K
> rpm disk, whilst it will probably have a lower seek time than a 10K rpm
> disk, won't have a proportionately (i.e., 2/3rds) lower seek time.

There are three factors that affect how fast you can get to the next sector:

  seek time
  settle time
  rotational latency

Most drives only list the first, and don't bother to mention the other two. On many modern drives, the seek times are around 5 to 10 milliseconds. The settle time varies as well; the longer the seek, the longer the settle, generally. This is the time it takes for the head to stop shaking and rest quietly over a particular track. Rotational latency is the amount of time you have to wait, on average, for the sector you want to come under the heads.

Assuming an 8 ms seek and 2 ms settle (typical numbers), and that the rotational latency on average is 1/2 of a rotation: at 10k rpm, a rotation takes 1/166.667 of a second, or 6 ms, so half a rotation is approximately 3 ms. By going to a 15k rpm drive, the latency drops to 2 ms.
So, if we add them up, on the same basic drive, one being 10k and one being 15k, we get:

10k rpm: 8 + 2 + 3 = 13 ms
15k rpm: 8 + 2 + 2 = 12 ms

So, based on the decrease in rotational latency being the only advantage the 15k rpm drive has over the 10k rpm drive, we get a decrease in access time of only 1 ms, or only about an 8% decrease in actual seek time. So, if your random page cost on 10k rpm drives was 1.7, you'd need to drop it to 1.57 or so to reflect the speed increase from 15k rpm drives.

I.e. it's much more likely that going from 1 gig to 2 gigs of RAM will make a noticeable difference than going from 10k to 15k drives.
Re: [PERFORM] rapid degradation after postmaster restart
Six days ago I installed Pg 7.4.1 on Sparc Solaris 8 also. I am hopeful that we as well can migrate a bunch of our apps from Oracle. After doing some informal benchmarks and performance testing for the past week, I am becoming more and more impressed with what I see.

I have seen similar results to what you are describing. I found that running a full vacuum:

vacuumdb -fza

followed by a checkpoint makes it run fast again. Try timing the update with and without a full vacuum. I can't help but wonder if a clean shutdown includes some vacuuming.

Obviously, in a production database this would be an issue.

Please post back what you learn.

Sincerely,
Marty

Joe Conway wrote:

I'm trying to troubleshoot a performance issue on an application ported from Oracle to postgres. Now, I know the best way to get help is to post the schema, explain analyze output, etc, etc -- unfortunately I can't do that at the moment. However, maybe someone can point me in the right direction to figure this out on my own. That said, here are a few details...

PostgreSQL 7.4.1
bash-2.03$ uname -a
SunOS col65 5.8 Generic_108528-27 sun4u sparc SUNW,Sun-Fire-280R

The problem is this: the application runs an insert that fires off a trigger, that cascades into a fairly complex series of functions, that do a bunch of calculations, inserts, updates, and deletes. Immediately after a postmaster restart, the first insert or two take about 1.5 minutes (undoubtedly this could be improved, but it isn't the main issue). However, by the second or third insert, the time increases to 7 - 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e. the first one or two inserts are back to the 1.5 minute range.

Any ideas spring to mind? I don't have much experience with Postgres on Solaris -- could it be related to that somehow?

Thanks for any insights.

Joe