Re: [PERFORM] Testing Sandforce SSD
Yeb Havinga wrote:
> Writes/s start low but quickly converge to a number in the range of
> 1200 to 1800. The writes diskchecker does are 16kB writes. Making this
> 4kB writes does not increase writes/s. 32kB seems a little less, 64kB
> is about two thirds of the initial writes/s, and 128kB is half.

Let's turn that into MB/s numbers:

4kB   * 1200             = 4.7 MB/s
8kB   * 1200             = 9.4 MB/s
16kB  * 1200             = 18.75 MB/s
64kB  * 1200 * 2/3 [800] = 50 MB/s
128kB * 1200 / 2   [600] = 75 MB/s

For comparison's sake, a 7200 RPM drive running PostgreSQL will do <120
commits/second without a BBWC, so at an 8K block size that's <1 MB/s.
If you put a cache in the middle, I'm used to seeing about 5000 8K
commits/second, which is around 40 MB/s. So this is sitting right in
the middle of those two. Sequential writes with a commit after each one
like this are basically the worst case for the SSD, so if it can
provide reasonable performance on that I'd be happy.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
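In code, the same arithmetic (a trivial Python sketch; the rates are
the ones measured above):

    # commits/second -> MB/s for the measured block sizes
    rates = {4: 1200, 8: 1200, 16: 1200, 64: 800, 128: 600}  # kB -> commits/s
    for kb, per_sec in sorted(rates.items()):
        print("%3d kB * %4d/s = %6.2f MB/s" % (kb, per_sec, kb * per_sec / 1024.0))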
Re: [PERFORM] Testing Sandforce SSD
Yeb Havinga wrote:
> Yeb Havinga wrote:
>> diskchecker: running 37 sec, 4.47% coverage of 500 MB (1468 writes; 39/s)
>> Total errors: 0
>> :-)
> OTOH, I now notice the 39 writes/s... If that means ~39 tps... bummer.

When playing with it a bit more, I couldn't get the test_file to be
created in the right place on the test system. It turns out I had the
diskchecker config switched, and 39 writes/s was the speed of the
not-rebooted system; sorry. I did several diskchecker.pl tests this
time with the testfile on the SSD, and none of the tests returned an
error. :-)

Writes/s start low but quickly converge to a number in the range of
1200 to 1800. The writes diskchecker does are 16kB writes. Making this
4kB writes does not increase writes/s. 32kB seems a little less, 64kB
is about two thirds of the initial writes/s, and 128kB is half.

So no BBU speeds here for writes, but still about a factor 10
improvement in iops over a rotating SATA disk.

regards,
Yeb Havinga

PS: hdparm showed the write cache was on. I did tests with both ext2
and xfs; the xfs tests I did with both barrier and nobarrier.
Re: [PERFORM] Testing Sandforce SSD
Joshua D. Drake wrote:
> That is quite the toy. I can get 4 SATA-II drives with a RAID
> controller with battery-backed cache for the same price or less. :P

True, but tests like http://www.anandtech.com/show/2899/12 suggest
there's probably at least a 6:1 performance speedup for workloads with
a lot of random I/O in them. And I'm really getting sick of the
power/noise/heat that the 6 drives in my home server produce.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] Testing Sandforce SSD
On Sat, 2010-07-24 at 16:21 -0400, Greg Smith wrote:
> Greg Smith wrote:
> > Note that not all of the Sandforce drives include a capacitor; I hope
> > you got one that does! I wasn't aware any of the SF drives with a
> > capacitor on them were even shipping yet; all of the ones I'd seen
> > were the chipset that doesn't include one still. Haven't checked in a
> > few weeks though.
>
> Answer my own question here: the drive Yeb got is the brand spanking
> new OCZ Vertex 2 Pro, selling for $649 at Newegg, for example:
> http://www.newegg.com/Product/Product.aspx?Item=N82E16820227535 with
> the supercapacitor listed right in the main product specifications
> there. This is officially the first (relatively) inexpensive SSD with
> a battery-backed write cache built into it. If Yeb's test results
> prove it works as it's supposed to under PostgreSQL, I'll be happy to
> finally have a moderately priced SSD I can recommend to people for
> database use. And I fear I'll be out of excuses to avoid buying one as
> a toy for my home system.

That is quite the toy. I can get 4 SATA-II drives with a RAID
controller with battery-backed cache for the same price or less. :P

Sincerely,

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
Re: [PERFORM] Testing Sandforce SSD
Greg Smith wrote:
> Note that not all of the Sandforce drives include a capacitor; I hope
> you got one that does! I wasn't aware any of the SF drives with a
> capacitor on them were even shipping yet; all of the ones I'd seen
> were the chipset that doesn't include one still. Haven't checked in a
> few weeks though.

Answer my own question here: the drive Yeb got is the brand spanking
new OCZ Vertex 2 Pro, selling for $649 at Newegg, for example:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227535 with
the supercapacitor listed right in the main product specifications
there. This is officially the first (relatively) inexpensive SSD with a
battery-backed write cache built into it. If Yeb's test results prove
it works as it's supposed to under PostgreSQL, I'll be happy to finally
have a moderately priced SSD I can recommend to people for database
use. And I fear I'll be out of excuses to avoid buying one as a toy for
my home system.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] Testing Sandforce SSD
Yeb Havinga wrote:
> diskchecker: running 37 sec, 4.47% coverage of 500 MB (1468 writes; 39/s)
> Total errors: 0
> :-)

OTOH, I now notice the 39 writes/s... If that means ~39 tps... bummer.
Re: [PERFORM] Testing Sandforce SSD
Greg Smith wrote:
> Note that not all of the Sandforce drives include a capacitor; I hope
> you got one that does! I wasn't aware any of the SF drives with a
> capacitor on them were even shipping yet; all of the ones I'd seen
> were the chipset that doesn't include one still. Haven't checked in a
> few weeks though.

I think I did; it was expensive enough, though while ordering it is
very easy to order the wrong one -- all the names on the product
category page look the same. (OCZ Vertex 2 Pro)

>> * How to test for power failure?
>
> I've had good results using one of the early programs used to
> investigate this class of problems:
> http://brad.livejournal.com/2116715.html?page=2

A great tool, thanks for the link!

diskchecker: running 34 sec, 4.10% coverage of 500 MB (1342 writes; 39/s)
diskchecker: running 35 sec, 4.24% coverage of 500 MB (1390 writes; 39/s)
diskchecker: running 36 sec, 4.35% coverage of 500 MB (1427 writes; 39/s)
diskchecker: running 37 sec, 4.47% coverage of 500 MB (1468 writes; 39/s)
 didn't get 'ok' from server (11387 316950), msg=[] = Connection reset by peer at ./diskchecker.pl line 132.

Here's where I removed the power and left it off for about a minute.
Then I started again and did the verify:

y...@a:~$ ./diskchecker.pl -s client45.eemnes verify test_file
 verifying: 0.00%
Total errors: 0

:-) This was on ext2.

>> * What filesystem to use on the SSD? To minimize writes and maximize
>> chance for seeing errors I'd choose ext2 here.
>
> I don't consider there to be any reason to deploy any part of a
> PostgreSQL database on ext2. The potential for downtime if the fsck
> doesn't happen automatically far outweighs the minimal performance
> advantage you'll actually see in real applications.

Hmm... wouldn't that apply to other filesystems as well? I know that
JFS also won't mount if booted unclean; it somehow needs a marker from
the fsck. I don't know about ext3, xfs, etc.

> All of the benchmarks showing large gains for ext2 over ext3 I have
> seen have been synthetic, not real database performance; the internal
> ones I've run using things like pgbench do not show a significant
> improvement. (Yes, I'm already working on finding time to publicly
> release those findings.)

The reason I'd choose ext2 on the SSD was mainly to decrease the number
of writes, not for performance. Maybe I should ultimately do tests for
both journalled and ext2 filesystems and compare the amount of data
written per x pgbench transactions.

> Put it on ext3, toggle on noatime, and move on to testing. The
> overhead of the metadata writes is the least of the problems when
> doing write-heavy stuff on Linux.

Will surely do and post the results.

thanks,
Yeb Havinga
Re: [PERFORM] Testing Sandforce SSD
On Sat, Jul 24, 2010 at 3:20 AM, Yeb Havinga wrote:
> Hello list,
>
> Probably like many others I've wondered why no SSD manufacturer puts a
> small BBU on an SSD drive. Triggered by Greg Smith's mail
> http://archives.postgresql.org/pgsql-performance/2010-02/msg00291.php
> here, and also anandtech's review at
> http://www.anandtech.com/show/2899/1 (see page 6 for pictures of the
> capacitor), I ordered a SandForce drive and this week it finally
> arrived.
>
> And now I have to test it and was wondering about some things like
>
> * How to test for power failure?

I test like this: write a small program that sends an endless series
of inserts like this:

*) on the server: create table foo (id serial);
*) from the client: insert into foo default values returning id;

On the client side, print the inserted value to the terminal after the
query is reported as complete to the client. Run the program, wait a
bit, then pull the plug on the server. The database should recover
clean, and the last reported insert on the client should be there when
it restarts. Try restarting immediately a few times; if that works,
try it again and let it simmer overnight. If it makes it at least
24-48 hours, that's a very promising sign.

merlin
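To make that concrete, a minimal sketch of such a client in Python with
psycopg2 (the connection parameters are placeholders, and it assumes
the table was created as above):

    # endless insert loop; the last id printed must still be in the
    # table after the server comes back up
    import psycopg2

    conn = psycopg2.connect("host=testserver dbname=postgres")  # placeholder DSN
    cur = conn.cursor()

    while True:
        cur.execute("insert into foo default values returning id")
        row_id = cur.fetchone()[0]
        conn.commit()              # the insert is only durable once this returns
        print(row_id, flush=True)  # report only after the commit succeeded

After the crash, "select max(id) from foo;" should be at least the last
id the client printed.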
Re: [PERFORM] Testing Sandforce SSD
Yeb Havinga wrote:
> Probably like many others I've wondered why no SSD manufacturer puts a
> small BBU on an SSD drive. Triggered by Greg Smith's mail
> http://archives.postgresql.org/pgsql-performance/2010-02/msg00291.php
> here, and also anandtech's review at
> http://www.anandtech.com/show/2899/1 (see page 6 for pictures of the
> capacitor), I ordered a SandForce drive and this week it finally
> arrived.

Note that not all of the Sandforce drives include a capacitor; I hope
you got one that does! I wasn't aware any of the SF drives with a
capacitor on them were even shipping yet; all of the ones I'd seen
were the chipset that doesn't include one still. Haven't checked in a
few weeks though.

> * How to test for power failure?

I've had good results using one of the early programs used to
investigate this class of problems:
http://brad.livejournal.com/2116715.html?page=2

You really need a second "witness" server to do this sort of thing
reliably, which that provides.

> * What filesystem to use on the SSD? To minimize writes and maximize
> chance for seeing errors I'd choose ext2 here.

I don't consider there to be any reason to deploy any part of a
PostgreSQL database on ext2. The potential for downtime if the fsck
doesn't happen automatically far outweighs the minimal performance
advantage you'll actually see in real applications. All of the
benchmarks showing large gains for ext2 over ext3 I have seen have
been synthetic, not real database performance; the internal ones I've
run using things like pgbench do not show a significant improvement.
(Yes, I'm already working on finding time to publicly release those
findings.)

Put it on ext3, toggle on noatime, and move on to testing. The
overhead of the metadata writes is the least of the problems when
doing write-heavy stuff on Linux.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
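For reference, diskchecker.pl runs against that second "witness"
machine; the invocation looks roughly like this (hostname and file size
are placeholders; see the linked page for the authoritative usage):

    # on the witness machine, which keeps its power:
    ./diskchecker.pl -l

    # on the machine under test, writing to the drive being checked:
    ./diskchecker.pl -s witnesshost create test_file 500

    # pull the plug mid-run, power back up, then:
    ./diskchecker.pl -s witnesshost verify test_file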
Re: [PERFORM] Testing Sandforce SSD
On Jul 24, 2010, at 12:20 AM, Yeb Havinga wrote:
> The problem in this scenario is that even when the SSD shows no data
> loss and the rotating disk does a few times, a dozen tests without
> failure isn't actually proof that the drive can write its complete
> buffer to disk after power failure.

Yes, this is always going to be the case with testing like this:
you'll never be able to prove that it will always be safe.
Re: [PERFORM] Using more than one index per table
On 7/24/10 5:57 AM, Torsten Zühlsdorff wrote:
> Craig James schrieb:
>>>> The problem is that Google ranks pages based on inbound links, so
>>>> older versions of Postgres *always* come up before the latest
>>>> version in page ranking.
>>>
>>> Since 2009 you can deal with this by defining the canonical version.
>>> (http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html)
>>
>> This is a really cool feature, but it's not what we need. The
>> "canonical" refers to the URL, not the web page. It's only supposed
>> to be used if you have multiple URLs that are actually the *same*
>> page; the "canonical" URL tells Google "use only this URL for this
>> page." But in our case, the Postgres manuals for each release have
>> different URLs *and* different content, so the "canonical URL" isn't
>> the right solution.
>
> This is true, but the content is allowed to change "a little". Of
> course there is no fixed percentage of allowed change, but it can be
> quite a lot. I've used this feature for some clients who push their
> content onto very different websites, and it does work. Most of the
> content of the documentation doesn't change much between releases; in
> most cases the canonical will work the way I suggest. In case of big
> changes, even the recommendation of using a "current" version won't
> work.
>
> It's true that Google ranks pages based on inbound links, but there
> are more than 200 other factors which influence the rankings. Most
> people do not know that changing most of a site's content makes its
> inbound links useless for a long time. After big changes in the
> documentation, the "current" entry will be dropped for some months
> and the old entries will appear. But note that every single page of
> the documentation is ranked by itself. From my experience I would
> expect better results from the canonical version than from the
> "current" version.
>
> But the canonical is not the best solution in my opinion. I often
> edit the URLs of some documentation pages, because I need them for a
> specific PostgreSQL version. The documentation clearly misses a
> version switch. Combined with a big note that the currently displayed
> documentation is not the one for the current PostgreSQL version, this
> would be the best compromise in my opinion.

Here's an idea: Use a "current" URL, plus a JavaScript snippet embedded
in every page that compares its own URL to the "current" URL and, if
it doesn't match, does a "document.write()" indicating how to find the
most-current version. That would solve three problems:

1. There would be a "current" version that people could link to.

2. If someone found an old version, they would know it and could
instantly be directed to the current version.

3. It wouldn't be any burden on the web site maintainers, because the
JavaScript wouldn't have to be changed.

Craig
Re: [PERFORM] Using more than one index per table
Craig James schrieb:
>>> The problem is that Google ranks pages based on inbound links, so
>>> older versions of Postgres *always* come up before the latest
>>> version in page ranking.
>>
>> Since 2009 you can deal with this by defining the canonical version.
>> (http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html)
>
> This is a really cool feature, but it's not what we need. The
> "canonical" refers to the URL, not the web page. It's only supposed
> to be used if you have multiple URLs that are actually the *same*
> page; the "canonical" URL tells Google "use only this URL for this
> page." But in our case, the Postgres manuals for each release have
> different URLs *and* different content, so the "canonical URL" isn't
> the right solution.

This is true, but the content is allowed to change "a little". Of
course there is no fixed percentage of allowed change, but it can be
quite a lot. I've used this feature for some clients who push their
content onto very different websites, and it does work. Most of the
content of the documentation doesn't change much between releases; in
most cases the canonical will work the way I suggest. In case of big
changes, even the recommendation of using a "current" version won't
work.

It's true that Google ranks pages based on inbound links, but there
are more than 200 other factors which influence the rankings. Most
people do not know that changing most of a site's content makes its
inbound links useless for a long time. After big changes in the
documentation, the "current" entry will be dropped for some months and
the old entries will appear. But note that every single page of the
documentation is ranked by itself. From my experience I would expect
better results from the canonical version than from the "current"
version.

But the canonical is not the best solution in my opinion. I often edit
the URLs of some documentation pages, because I need them for a
specific PostgreSQL version. The documentation clearly misses a
version switch. Combined with a big note that the currently displayed
documentation is not the one for the current PostgreSQL version, this
would be the best compromise in my opinion.

Greetings from Germany,
Torsten
Re: [PERFORM] Testing Sandforce SSD
On Sat, 24 Jul 2010, David Boreham wrote:
>> Do you guys have any more ideas to properly 'feel this disk at its
>> teeth'?
>
> While an 'end-to-end' test using PG is fine, I think it would be
> easier to determine if the drive is behaving correctly by using a
> simple test program that emulates the storage semantics the WAL
> expects. Have it write a constant stream of records, fsync'ing after
> each write. Record the highest record number flushed so far in some
> place that won't be lost with the drive under test (e.g. send it over
> the network to another machine). Kill the power, bring the system
> back up again, and examine what's at the tail end of that file. I
> think this will give you the worst-case test with the easiest result
> discrimination.
>
> If you want to, you could add concurrent random writes to another
> file for extra realism. Someone here may already have a suitable test
> program. I know I've written several over the years in order to test
> I/O performance, prove the existence of kernel bugs, and so on.
>
> I doubt it matters much how long the power is turned off. A second
> should be plenty of time to flush pending writes if the drive is
> going to do so.

Remember that SATA is designed to be hot-plugged, so you don't have to
kill the entire system to kill power to the drive. This is a little
more abrupt than the system losing power, but in terms of losing data
this is about the worst case (while at the same time, it eliminates
the possibility that the OS continues to perform writes to the drive
as power dies, which is a completely different class of problem,
independent of the drive type).

David Lang
Re: [PERFORM] Testing Sandforce SSD
> Do you guys have any more ideas to properly 'feel this disk at its
> teeth'?

While an 'end-to-end' test using PG is fine, I think it would be
easier to determine if the drive is behaving correctly by using a
simple test program that emulates the storage semantics the WAL
expects. Have it write a constant stream of records, fsync'ing after
each write. Record the highest record number flushed so far in some
place that won't be lost with the drive under test (e.g. send it over
the network to another machine). Kill the power, bring the system back
up again, and examine what's at the tail end of that file. I think
this will give you the worst-case test with the easiest result
discrimination.

If you want to, you could add concurrent random writes to another file
for extra realism. Someone here may already have a suitable test
program. I know I've written several over the years in order to test
I/O performance, prove the existence of kernel bugs, and so on.

I doubt it matters much how long the power is turned off. A second
should be plenty of time to flush pending writes if the drive is going
to do so.
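A minimal sketch of such a test program in Python, assuming a plain TCP
listener on a second machine records the reported sequence numbers (the
witness host, port, and file path are placeholders):

    # Write fixed-size numbered records, fsync each one, and only then
    # report the sequence number to the witness machine.
    # The witness side can be as simple as: nc -lk 9999 > flushed.log
    import os
    import socket
    import struct

    WITNESS = ("witnesshost", 9999)   # placeholder witness machine
    PATH = "/mnt/ssd/waltest.dat"     # file on the drive under test
    RECORD = 8192                     # 8kB records, WAL-block sized

    sock = socket.create_connection(WITNESS)
    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o644)

    seq = 0
    while True:
        payload = struct.pack(">Q", seq).ljust(RECORD, b"\0")
        os.write(fd, payload)
        os.fsync(fd)                            # must be durable before reporting
        sock.sendall(("%d\n" % seq).encode())   # witness logs the last safe seq
        seq += 1

After the power cycle, the highest sequence number still readable at
the tail of the file has to be at least the last number the witness
logged; anything lower means the drive acknowledged an fsync it hadn't
actually made durable.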
[PERFORM] Testing Sandforce SSD
Hello list,

Probably like many others I've wondered why no SSD manufacturer puts a
small BBU on an SSD drive. Triggered by Greg Smith's mail
http://archives.postgresql.org/pgsql-performance/2010-02/msg00291.php
here, and also anandtech's review at
http://www.anandtech.com/show/2899/1 (see page 6 for pictures of the
capacitor), I ordered a SandForce drive and this week it finally
arrived.

And now I have to test it and was wondering about some things like:

* How to test for power failure?

I thought of running, on the same machine, a parallel pgbench setup on
two clusters, where one runs with data and WAL on a rotating disk and
the other on the SSD, both without a BBU controller. Then turn off the
power. Do that a few times. The problem in this scenario is that even
when the SSD shows no data loss and the rotating disk does a few
times, a dozen tests without failure isn't actually proof that the
drive can write its complete buffer to disk after power failure.

* How long should the power be turned off? A minute? 15 minutes?

* What filesystem to use on the SSD? To minimize writes and maximize
the chance of seeing errors I'd choose ext2 here. For the sake of not
comparing apples with pears, I'd have to go with ext2 on the rotating
data disk as well.

Do you guys have any more ideas to properly 'feel this disk at its
teeth'?

regards,
Yeb Havinga
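The pgbench side of that setup could look roughly like this (scale
factor, run length, and port numbers are made up; one cluster per
drive, each listening on its own port):

    # initialize a test database in each cluster
    pgbench -i -s 50 -p 5432 test    # cluster with data+WAL on rotating disk
    pgbench -i -s 50 -p 5433 test    # cluster with data+WAL on the SSD

    # run both concurrently, then cut the power mid-run
    pgbench -c 8 -T 600 -p 5432 test &
    pgbench -c 8 -T 600 -p 5433 test &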