[PERFORM] Sorted group by
I'm trying to eke a little bit more performance out of an application, and I
was wondering if there was a better way to do the following:

I am trying to retrieve, for many sets of rows grouped on a couple of fields,
the value of an ungrouped field where the row has the highest value in
another ungrouped field. For instance, I have the following table setup:

    group  | whatever type
    value  | whatever type
    number | int
    Index: group

I then have rows like this:

    group | value     | number
    ------+-----------+-------
    Foo   | foo       |      1
    Foo   | turnips   |      2
    Bar   | albatross |      3
    Bar   | monkey    |      4

I want to receive results like this:

    group | value
    ------+--------
    Foo   | turnips
    Bar   | monkey

Currently, I do this in my application by ordering by the number and only
using the last value. I imagine that this is something that can be done in
the new Postgres 9, with a sorted group by - something like this:

    SELECT group, LAST(value, ORDER BY number) FROM table GROUP BY group

Is this something that is already built in, or would I have to write my own
LAST aggregate function?

Matthew

--
The third years are wandering about all worried at the moment because they
have to hand in their final projects. Please be sympathetic to them, say
things like "ha-ha-ha", but in a sympathetic tone of voice
    -- Computer Science Lecturer
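For reference, a hand-rolled LAST aggregate is small to write if the built-in
route does not pan out. A minimal sketch, relying on PostgreSQL 9.0's support
for ORDER BY inside an aggregate call; the function and aggregate names are
illustrative, not built-ins, and "tbl" stands in for the real table:

    -- Transition function that simply keeps the most recent input value;
    -- with ORDER BY inside the aggregate call (9.0+), "most recent" means
    -- the row with the highest number.
    CREATE OR REPLACE FUNCTION last_agg(anyelement, anyelement)
    RETURNS anyelement
    LANGUAGE sql IMMUTABLE AS
    $$ SELECT $2 $$;

    CREATE AGGREGATE last(anyelement) (
        SFUNC = last_agg,
        STYPE = anyelement
    );

    -- Assumed usage against the table sketched above ("group" needs
    -- quoting because it is a reserved word):
    SELECT "group", last(value ORDER BY number)
    FROM tbl
    GROUP BY "group";

On 8.4, which lacks ORDER BY inside aggregate calls, the same effect needs
the window-function or DISTINCT ON approaches discussed below.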
Re: [PERFORM] Sorted group by
On Tue, 10 Aug 2010, Thomas Kellerer wrote:
> No. It's built in (8.4) and it's called Windowing functions:
> http://www.postgresql.org/docs/8.4/static/tutorial-window.html
> http://www.postgresql.org/docs/8.4/static/functions-window.html
>
> SELECT group, last_value(value) over(ORDER BY number) FROM table

I may be mistaken, but as I understand it, a windowing function doesn't
reduce the number of rows in the results?

Matthew

--
Don't worry about people stealing your ideas. If your ideas are any good,
you'll have to ram them down people's throats.  -- Howard Aiken
Re: [PERFORM] Sorted group by
On 10 August 2010 17:03, Matthew Wakeling <matt...@flymine.org> wrote:
> On Tue, 10 Aug 2010, Thomas Kellerer wrote:
>> No. It's built in (8.4) and it's called Windowing functions:
>> http://www.postgresql.org/docs/8.4/static/tutorial-window.html
>> http://www.postgresql.org/docs/8.4/static/functions-window.html
>>
>> SELECT group, last_value(value) over(ORDER BY number) FROM table
>
> I may be mistaken, but as I understand it, a windowing function doesn't
> reduce the number of rows in the results?

I think you are mistaken. The last_value function is a window function
aggregate. Give it a try.

--
Thom Brown
Registered Linux user: #516935
Re: [PERFORM] Sorted group by
On Tue, Aug 10, 2010 at 04:40:16PM +0100, Matthew Wakeling wrote:
> I am trying to retrieve, for many sets of rows grouped on a couple of
> fields, the value of an ungrouped field where the row has the highest
> value in another ungrouped field.
> [...]
> Currently, I do this in my application by ordering by the number and only
> using the last value. I imagine that this is something that can be done in
> the new Postgres 9, with a sorted group by - something like this:
>
>     SELECT group, LAST(value, ORDER BY number) FROM table GROUP BY group
>
> Is this something that is already built in, or would I have to write my
> own LAST aggregate function?

This is trivially done when using 'distinct on':

    select distinct on (group) *
    from table
    order by group desc, number desc;

depesz

--
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
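One hedged addition to the DISTINCT ON suggestion: a composite index that
matches the ORDER BY can let the planner satisfy the ordering from the index
and skip the sort. A sketch, with the table and index names assumed rather
than taken from the original schema:

    -- Index matching the DISTINCT ON ordering ("group" is quoted because
    -- it is a reserved word); the index name is illustrative.
    CREATE INDEX tbl_group_number_idx ON tbl ("group" DESC, number DESC);

    SELECT DISTINCT ON ("group") "group", value
    FROM tbl
    ORDER BY "group" DESC, number DESC;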
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
Scott Carey wrote:
> Also, the amount of data at risk in a power loss varies between drives.
> For Intel's drives, it's a small chunk of data (< 256K). For some other
> drives, the cache can be over 30MB of outstanding writes. For some
> workloads this is acceptable

No, it isn't ever acceptable. You can expect the type of data loss you get
when a cache fails to honor write flush calls to result in catastrophic
database corruption. It's not "I lost the last few seconds"; it's "the
database is corrupted and won't start after a crash." This is why we pound
on this topic on this list. An SSD that fails to honor flush requests is
completely worthless for anything other than toy databases. You can expect
significant work to recover any portion of your data after the first
unexpected power loss under heavy write load in this environment, during
which you're down. We do database corruption recovery at 2ndQuadrant; while
I can't talk about the details of some recent incidents, I am not speaking
theoretically when I warn about this.

Michael, I would suggest you read
http://www.postgresql.org/docs/current/static/wal-reliability.html and link
to it at the end of your article. You are recommending that people consider
a configuration that will result in their data being lost. That can be
acceptable, if for example your data is possible to recreate from backups or
the like. But people should be extremely clear that that trade-off is
happening, and your blog post is not doing that yet. Part of the reason for
the bang per buck you're seeing here is that cheap SSDs are cheating.

--
Greg Smith   2ndQuadrant US   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] Sorted group by
Matthew Wakeling wrote on 10.08.2010 18:03:
> On Tue, 10 Aug 2010, Thomas Kellerer wrote:
>> No. It's built in (8.4) and it's called Windowing functions:
>> http://www.postgresql.org/docs/8.4/static/tutorial-window.html
>> http://www.postgresql.org/docs/8.4/static/functions-window.html
>>
>> SELECT group, last_value(value) over(ORDER BY number) FROM table
>
> I may be mistaken, but as I understand it, a windowing function doesn't
> reduce the number of rows in the results?

Yes, you are right, a bit too quick on my side ;)

But this might be what you are after then:

    select group_, value_
    from (
        select group_, value_, number_,
               row_number() over (partition by group_ order by number_ desc) as row_num
        from numbers
    ) t
    where row_num = 1
    order by group_ desc
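A quick way to sanity-check that query is to run it against the sample data
from the original post; a self-contained sketch, with the temp table and
column names following the renaming used above:

    CREATE TEMP TABLE numbers (group_ text, value_ text, number_ int);
    INSERT INTO numbers VALUES
        ('Foo', 'foo',       1),
        ('Foo', 'turnips',   2),
        ('Bar', 'albatross', 3),
        ('Bar', 'monkey',    4);

    SELECT group_, value_
    FROM (
        SELECT group_, value_,
               row_number() OVER (PARTITION BY group_ ORDER BY number_ DESC) AS row_num
        FROM numbers
    ) t
    WHERE row_num = 1
    ORDER BY group_ DESC;

    -- Expected output:
    --  group_ | value_
    -- --------+---------
    --  Foo    | turnips
    --  Bar    | monkey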
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
Jeff Davis wrote:
> Depending on which 256K you lose, you might as well lose your entire
> database.

Let's be nice and assume that you only lose one 8K block because of the SSD
write cache; that's not so bad, right? Guess what--you could easily be the
next lucky person who discovers that the corrupted block is actually in the
middle of the pg_class system catalog, where the list of tables in the
database lives. Enjoy getting your data back again with that piece missing.
It's really a fun time, I'll tell you that.

--
Greg Smith   2ndQuadrant US   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] Sorted group by
On 10 August 2010 17:06, Thom Brown <t...@linux.com> wrote:
> On 10 August 2010 17:03, Matthew Wakeling <matt...@flymine.org> wrote:
>> I may be mistaken, but as I understand it, a windowing function doesn't
>> reduce the number of rows in the results?
>
> I think you are mistaken. The last_value function is a window function
> aggregate. Give it a try.

D'oh, no, I'm mistaken. My brain has been malfunctioning today.

--
Thom Brown
Registered Linux user: #516935
Re: [PERFORM] Sorted group by
Matthew Wakeling <matt...@flymine.org> wrote:
> I'm trying to eke a little bit more performance out of an application

In addition to the suggestion from Thomas Kellerer, it would be interesting
to try the following and see how performance compares using real data.

    select group, value
    from tbl x
    where not exists (select * from tbl y
                      where y.group = x.group
                        and y.number > x.number);

We have a lot of code using this general technique, and I'm curious whether
there are big gains to be had by moving to the windowing functions. (I
suspect there are.)

-Kevin
Re: [PERFORM] Sorted group by
Another couple of possible ways:

    Select groupfield, value
    From tbl x1
    Where number = (select max(number) from tbl x2
                    where x2.groupfield = x1.groupfield)

    Select groupfield, value
    From tbl x1
    Where (groupfield, number) in (select groupfield, max(number)
                                   from tbl
                                   group by groupfield)

Which is quickest? Probably best to try out and see.

-----Original Message-----
From: Kevin Grittner
Sent: Tuesday, August 10, 2010 7:38 PM
To: Matthew Wakeling; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Sorted group by

> In addition to the suggestion from Thomas Kellerer, it would be
> interesting to try the following and see how performance compares using
> real data.
> [...]
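To settle "which is quickest" empirically, each candidate can be wrapped in
EXPLAIN ANALYZE on the real data and the timings compared; for example, with
the table and column names used in the queries above:

    EXPLAIN ANALYZE
    SELECT groupfield, value
    FROM tbl x1
    WHERE number = (SELECT max(number) FROM tbl x2
                    WHERE x2.groupfield = x1.groupfield);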
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
On 8/10/2010 12:21 PM, Greg Smith wrote:
> Scott Carey wrote:
>> Also, the amount of data at risk in a power loss varies between drives.
>> For Intel's drives, it's a small chunk of data (< 256K). For some other
>> drives, the cache can be over 30MB of outstanding writes. For some
>> workloads this is acceptable
>
> No, it isn't ever acceptable. You can expect the type of data loss you get
> when a cache fails to honor write flush calls to result in catastrophic
> database corruption. It's not "I lost the last few seconds"; it's "the
> database is corrupted and won't start after a crash." This is why we pound
> on this topic on this list.
> [...]

What about putting indexes on them? If the drive fails and drops writes on
those, they could be rebuilt - assuming your system can function without the
index(es) temporarily.

--
Brad Nicholson   416-673-4106
Database Administrator, Afilias Canada Corp.
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
Brad Nicholson wrote:
> On 8/10/2010 12:21 PM, Greg Smith wrote:
>> No, it isn't ever acceptable. You can expect the type of data loss you
>> get when a cache fails to honor write flush calls to result in
>> catastrophic database corruption.
>> [...]
>
> What about putting indexes on them? If the drive fails and drops writes on
> those, they could be rebuilt - assuming your system can function without
> the index(es) temporarily.

You could put indices on them, but as noted by Scott, he's SPOT ON. ANY disk
that says "write is complete" when it really is not is entirely unsuitable
for ANY real database use. It is simply a matter of time before you have a
failure of some sort that results in catastrophic data loss.

If you're LUCKY the database won't start and you know you're in trouble. If
you're UNLUCKY the database DOES start, but there is undetected and
unrecoverable data corruption somewhere inside the data tables, which you
WILL discover at the most inopportune moment (like when you desperately NEED
that business record for some reason.)

You cannot put either the tables or the logs on such a drive without running
the risk of a data corruption problem that WILL lose data and MAY be
catastrophic, depending on exactly what fails when. While it is possible to
recover that which is not damaged from a database that has corruption like
this, it simply is not possible to recover data that never made it to the
disk - no matter what you do - and the time and effort (not to mention
money, if you have to bring in someone with specialized skills you do not
possess) that result from such decisions when things go wrong are extreme.

Don't do it.

-- Karl
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
On Tue, Aug 10, 2010 at 12:13 PM, Karl Denninger <k...@denninger.net> wrote:
> ANY disk that says "write is complete" when it really is not is entirely
> unsuitable for ANY real database use. It is simply a matter of time

What about read-only slaves where there's a master with 100+ spinning hard
drives getting it right and you need a half dozen or so read slaves? I can
imagine that being OK, as long as you don't restart a server after a crash
without checking on it.
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
Brad Nicholson wrote:
> What about putting indexes on them? If the drive fails and drops writes on
> those, they could be rebuilt - assuming your system can function without
> the index(es) temporarily.

Dumping indexes on SSD is one of the better uses for them, presuming you can
survive what is likely to be an outage from a "can the site handle full
load?" perspective while they rebuild after a crash. As I'm sure Brad is
painfully aware of already, index rebuilding in PostgreSQL can take a while.

To spin my broken record here again, the main thing to note when you
consider that approach--relocate indexes onto SSD--is that the indexes you
are most concerned about the performance of were likely to be sitting in RAM
already anyway, meaning the SSD speedup doesn't help reads much. So the
giant performance boost just isn't there in that case.

--
Greg Smith   2ndQuadrant US   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
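For completeness, relocating an index onto SSD is just a tablespace move; a
rough sketch, with the mount point, tablespace name, and index name invented
for illustration:

    -- The directory must already exist and be owned by the postgres user.
    CREATE TABLESPACE ssd_space LOCATION '/mnt/ssd/pg_tblspc';

    -- Move an expendable index (name is hypothetical) onto the SSD.
    ALTER INDEX reporting_batch_idx SET TABLESPACE ssd_space;

    -- After an SSD failure, drop the index and recreate it on ordinary
    -- storage (or in a replacement tablespace) rather than trying to
    -- REINDEX into the dead tablespace.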
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
Scott Marlowe wrote:
> On Tue, Aug 10, 2010 at 12:13 PM, Karl Denninger <k...@denninger.net> wrote:
>> ANY disk that says "write is complete" when it really is not is entirely
>> unsuitable for ANY real database use. It is simply a matter of time
>
> What about read-only slaves where there's a master with 100+ spinning hard
> drives getting it right and you need a half dozen or so read slaves? I can
> imagine that being OK, as long as you don't restart a server after a crash
> without checking on it.

A read-only slave isn't read-only, is it? I mean, c'mon - how does the data
get there?

IF you mean a server that only accepts SELECTs, does not accept UPDATEs or
INSERTs, and on a crash *reloads the entire database from the master*, then
ok.

Most people who will do this won't reload it after a crash. They'll inspect
the database, say "ok", and put it back online. Bad Karma will ensue in the
future.

Incidentally, that risk is not theoretical either (I know about this one
from hard experience. Fortunately the master was still ok and I was able to
force a full-table copy. I didn't like it, as the database was a few hundred
GB, but I had no choice.)

-- Karl
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
On 8/10/2010 2:28 PM, Greg Smith wrote:
> Dumping indexes on SSD is one of the better uses for them, presuming you
> can survive what is likely to be an outage from a "can the site handle
> full load?" perspective while they rebuild after a crash.
> [...]
> To spin my broken record here again, the main thing to note when you
> consider that approach--relocate indexes onto SSD--is that the indexes you
> are most concerned about the performance of were likely to be sitting in
> RAM already anyway, meaning the SSD speedup doesn't help reads much. So
> the giant performance boost just isn't there in that case.

The case where I'm thinking they may be of use is for indexes you can afford
to lose. I'm thinking of ones that are needed by nightly batch jobs,
downstream systems or reporting - the sorts of things that you can turn off
during a rebuild, and where the data sets are not likely to be in cache.

We have a few such cases, but we don't need the speed of SSDs for them.
Personally, I wouldn't entertain any SSD without a capacitor backing it for
anything, even indexes. Not worth the hassle to me.

--
Brad Nicholson   416-673-4106
Database Administrator, Afilias Canada Corp.
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
On 8/10/2010 2:38 PM, Karl Denninger wrote:
> Scott Marlowe wrote:
>> What about read-only slaves where there's a master with 100+ spinning
>> hard drives getting it right and you need a half dozen or so read slaves?
>> I can imagine that being OK, as long as you don't restart a server after
>> a crash without checking on it.
>
> A read-only slave isn't read-only, is it? I mean, c'mon - how does the
> data get there?

A valid case is a Slony replica if used for query offloading (not for DR).
It's considered a read-only subscriber from the perspective of Slony, as
only Slony can modify the data (although you are technically correct, it is
not read-only - "controlled write" may be more accurate).

In case of failure, a rebuild + resubscribe gets you back to the same
consistency. If you have high IO requirements, and don't have the budget to
rack up extra disk arrays to meet them, it could be an option.

--
Brad Nicholson   416-673-4106
Database Administrator, Afilias Canada Corp.
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
Brad Nicholson wrote:
> A valid case is a Slony replica if used for query offloading (not for DR).
> It's considered a read-only subscriber from the perspective of Slony, as
> only Slony can modify the data (although you are technically correct, it
> is not read-only - "controlled write" may be more accurate).
>
> In case of failure, a rebuild + resubscribe gets you back to the same
> consistency. If you have high IO requirements, and don't have the budget
> to rack up extra disk arrays to meet them, it could be an option.

CAREFUL with that model and beliefs. Specifically, the following will hose
you without warning:

1. Slony gets a change on the master.
2. Slony commits it to the (read-only) slave.
3. Confirmation comes back to the master that the change was propagated.
4. Slave CRASHES without actually committing the changed data to stable
   storage.

When the slave restarts it will not know that the transaction was lost.
Neither will the master, since it was told that it was committed. Slony will
happily go on its way and replicate forward, without any indication of a
problem - except that on the slave, there are one or more transactions that
are *missing*.

Some time later you issue an update that goes to the slave, but the change
previously lost causes the slave commit to violate referential integrity.
Slony will fail to propagate that change and all behind it - it effectively
locks at that point in time.

You can recover from this by dropping the slave from replication and
re-inserting it, but that forces a full-table copy of everything in the
replication set. The bad news is that the queries to the slave in question
may have been returning erroneous data for some unknown period of time prior
to the lockup in replication (which hopefully you detect reasonably quickly
- you ARE watching Slony queue depth with some automated process, right?)

I can both cause this in the lab and have had it happen in the field. It's a
nasty little problem that bit me on a series of disks that claimed to have
write caching off, but in fact did not. I was very happy that the data on
the master was good at that point, as if I had needed to fail over to the
slave (thinking it was a good copy) I would have been in SERIOUS trouble.

-- Karl
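On the "watching Slony queue depth" point, a minimal lag check can poll
Slony's sl_status view from the origin node; a sketch, assuming a cluster
named "mycluster" (the schema name is derived from the cluster name, so it
is an assumption here):

    -- Replication lag per subscriber; alert if st_lag_num_events keeps
    -- growing or st_lag_time exceeds a threshold.
    SELECT st_received       AS subscriber_node,
           st_lag_num_events AS events_behind,
           st_lag_time       AS time_behind
    FROM _mycluster.sl_status
    ORDER BY st_lag_time DESC;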
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
On 8/10/2010 3:28 PM, Karl Denninger wrote:
> CAREFUL with that model and beliefs. Specifically, the following will hose
> you without warning:
>
> 1. Slony gets a change on the master.
> 2. Slony commits it to the (read-only) slave.
> 3. Confirmation comes back to the master that the change was propagated.
> 4. Slave CRASHES without actually committing the changed data to stable
>    storage.

What will hose you is assuming that your data will be okay in the case of a
failure, which is a very bad assumption to make in the case of unreliable
SSDs. You are assuming I am implying that these should be treated like
reliable media - I am not. In case of failure, you need to assume data loss
until proven otherwise. If there is a problem, rebuild.

> When the slave restarts it will not know that the transaction was lost.
> Neither will the master, since it was told that it was committed. Slony
> will happily go on its way and replicate forward, without any indication
> of a problem - except that on the slave, there are one or more
> transactions that are *missing*.

Correct.

> Some time later you issue an update that goes to the slave, but the change
> previously lost causes the slave commit to violate referential integrity.
> Slony will fail to propagate that change and all behind it - it
> effectively locks at that point in time.

It will lock data flow to that subscriber, but not to others.

> You can recover from this by dropping the slave from replication and
> re-inserting it, but that forces a full-table copy of everything in the
> replication set. The bad news is that the queries to the slave in question
> may have been returning erroneous data for some unknown period of time
> prior to the lockup in replication (which hopefully you detect reasonably
> quickly - you ARE watching Slony queue depth with some automated process,
> right?)

There are ways around that - run two subscribers and redirect your queries
on failure. Don't bring up the failed replica until it is verified or
rebuilt.

> I can both cause this in the lab and have had it happen in the field. It's
> a nasty little problem that bit me on a series of disks that claimed to
> have write caching off, but in fact did not.

It's very easy to cause those sorts of problems.

What I am saying is that the technology can have a use, if you are aware of
the sharp edges and can both work around them and live with them. Everything
you are citing is correct, but it assumes they are being blindly thrown in
without understanding the risks and mitigating them.

I'm also not suggesting that this is a configuration I would endorse, but it
could potentially save a lot of money in certain use cases.

--
Brad Nicholson   416-673-4106
Database Administrator, Afilias Canada Corp.
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
On Tue, Aug 10, 2010 at 12:38 PM, Karl Denninger <k...@denninger.net> wrote:
> Scott Marlowe wrote:
>> What about read-only slaves where there's a master with 100+ spinning
>> hard drives getting it right and you need a half dozen or so read slaves?
>> I can imagine that being OK, as long as you don't restart a server after
>> a crash without checking on it.
>
> A read-only slave isn't read-only, is it? I mean, c'mon - how does the
> data get there?

Well, duh. However, what I'm looking at is having two big servers in
failover running on solid, reliable hardware, and then a small army of
read-only Slony slaves that are used for things like sending user RSS feeds
and creating weekly reports and such. These 1U machines with 12 to 24 cores
and a single SSD drive are disposable, in the sense that if they ever crash,
there's a simple script to run that re-inits the db and then subscribes them
to the set.

My point being, no matter how terrible an idea a certain storage media is,
there's always a use case for it. Even if it's very narrow.
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
On Tue, Aug 10, 2010 at 3:52 PM, Scott Marlowe <scott.marl...@gmail.com> wrote:
> My point being, no matter how terrible an idea a certain storage media is,
> there's always a use case for it. Even if it's very narrow.

The trouble is, if extra subscribers induce load on the master, which they
presumably will, then that sliver of use case may very well get obscured by
the cost, such that the sliver should be treated as not existing :-(.

--
http://linuxfinances.info/info/linuxdistributions.html
Re: [PERFORM] Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
On Tue, Aug 10, 2010 at 2:00 PM, Christopher Browne <cbbro...@gmail.com> wrote:
> On Tue, Aug 10, 2010 at 3:52 PM, Scott Marlowe <scott.marl...@gmail.com> wrote:
>> My point being, no matter how terrible an idea a certain storage media
>> is, there's always a use case for it. Even if it's very narrow.
>
> The trouble is, if extra subscribers induce load on the master, which they
> presumably will, then that sliver of use case may very well get obscured
> by the cost, such that the sliver should be treated as not existing :-(.

One master, one slave: the master handles all writes, the slave handles all
of the other subscribers. I've run a setup like this with as many as 8 or so
slaves at the bottom of the pile with no problems at all.