On Thu, 21 Mar 2002, Jay Thorne wrote: > The first one I noted was that he assumes that a high performance app has > several joins. I think everyone here who's developed a few db apps will tell > you that joins are hugely costly and should be avoided for an application's > most common cases.
Actually, I've developed _more_ than a few DB apps and I'd tell you that joins are not only _not_ hugely costly, but can sometimes be a performance improvement. It really depends on a lot of factors including what RDBMS you are using, how many connections you have, ratio of reads to writes, how complicated the joins are. But a blanket statement like that is flat out wrong. > I recoded a mysql/mod_perl web app with a couple of joins to a slightly but > not horridly denormalized db structure and realized some huge performance > gains. You removed denormalization, thereby increasing the amount of work necessary to maintain data integrity so the reads may be faster but the writes are definitely slower and more prone to screwing things up. Even worse, you denormalized in favor of _one_ app, obviously, which means any new app you write on top of this denormalized db in the future may suffer because the denormalization doesn't serve it well. > Further on, he was discussing threaded discussion forums and assumed a > recursive query. While recursing the DATA is probably (though provabably not > always) necessary, there is no need to storm the db with that many queries, > since query setup time, queing, and results extraction have noticable time > penalties. Extracting the data with a single query and recursing it in RAM is > a much more db and cpu friendly idea. This I can agree on, though it is more a limitation of SQL, which provides no meaningful way to do a recursive query. > As for performance, mysql and postgresql are getting pretty similar results, > _these_ days. With aggressive app level caching of db data, and the innodb > table handler, I was able to push mysql and mod_perl to 80+ 50kbyte page > views per second on bog standard 650 mhz intel hardware. Again, that will depend on a number of things including complexity of your queries and ratio of reads vs. writes, etc. It's worth noting that MySQL's query optimizer pretty much stinks and will often not use indexes because it simply gives up when queries have an 'OR' in them, for example. What it comes down to is that except for _extremely_ demanding applications, either one is likely to be perfectly suitable so pick the one that you like best, or that has the features you need, or that your sysadmins can support most easily, or whatever. -dave /*================== www.urth.org we await the New Sun ==================*/