Thanks for pointing out the mistake in postgres. Your Advice makes lots of sense. We will recreate the benchmark and post the results V Murali Differentiated Software Solutions Pvt. Ltd., 90, 3rd Cross,2nd Main, Ganga Nagar, Bangalore - 560 032 Phone : 91 80 3631445, 3431470 Visit us at www.diffsoft.com > ----- Original Message ----- > From: Cees Hek <[EMAIL PROTECTED]> > To: Murali V <[EMAIL PROTECTED]> > Cc: <[EMAIL PROTECTED]> > Sent: Friday, April 20, 2001 1:45 AM > Subject: [OT] Re: Fast DB access > > > > > > On Thu, 19 Apr 2001, Murali V wrote: > > > > > Hi, > > > > > > If you read the code more deeply, you'll find that the timeit is only > > > wrapped around select and not around insert. > > > We've written the insert code so that in the first round you can > populate > > > the database. > > > You comment out the insert code after the first round and run the > benchmark > > > several times. This would only do select and time select. > > > > > > > Hi Murali, > > > > OK, to start off, I was not specifically aiming my rant at you, I was > > replying to someone who had modified your code and was now comparing MySQL > > and PostgreSQL, and he was implying that the timings were for inserts and > > selects. I took this at face value, and didn't check the code close > > enough which I really should have done in the first place. > > > > > Connecting this error to an axiom that "Benchmarks are useless" is bad > > > indeed. Shouldn't we be ironing out errors and runing benchmarks which > are > > > good. > > > > Perhaps I should have said published benchmarks. In your case, you are > > using benchmarks for exactly what they are intended for... Creating a > > system that closely resembles your application and putting it through it's > > paces. What I find dangerous about publishing benchmarks, is that they > > are almost always heavily swayed to a specific application, and most of > > the time they show what the user wants them to show. > > > > In your original message, you clain to have a bias against Postgres, and > > your benchmark shows that bias. I however am a happy user of postgres, > > and am therefore biased towards it. I modified your benchmark script > > slightly, and I got the following results (I have include a diff of my > > changes at the bottom): > > > > postgres > > 0 wallclock secs ( 0.02 usr + 0.01 sys = 0.03 CPU) > > postgres > > 0 wallclock secs ( 0.02 usr + 0.00 sys = 0.02 CPU) > > > > Whereas if I run it with your version I get the following: > > > > postgres > > 27 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU) > > postgres > > 27 wallclock secs ( 0.02 usr + 0.00 sys = 0.02 CPU) > > > > So what does that tell you about the benchmark? that the postgres part of > > this benchmark is useless... It may have given you the answer that you > > wanted, but it is misleading to anyone else out there. > > > > This is why there are always flame wars about benchmarking databases (by > > the way I think this whole thread has been very civilized and i hope is > > stays that way). Invariably the benchmark has missed some critical idea > > or optimization which drastically skew the results. > > > > > Your recommendation is to pick a DB best suited to your app. But How ?? > > > a) Either by hiring a guru who has seen all kinds of apps with different > DBs > > > who can give you the answer with which we can run > > > b) Run a benchmark on critical programs which represent you app across > > > databases and find what performs best. > > > I've read too much literature on DB features. All DBs have all features > > > (except MySQL which does not have commit !!!!) > > > You can't make a thing out of DB literature. > > > > What I would recommend is exactly what you have done in this case. Get > > access to any and all the systems that you feel may do the job for you , > > and try them out. Browse the web for other users experiences, but don't > > use other peoples benchmarks, because the odds are good that they are > > wrong... Create your own, or modify an existing one, and scrutinize > > exactly what it is doing. And if you want to share your results with > > anyone else, tell them what you choose in the end, and why. Tell them you > > choose database x because it did this and this for you. Don't say > > database y is a piece of crap, so we went with database x. > > > > But whatever you do, don't choose your database based on other peoples > > benchmarks........ (that is all I'm trying to say, and I guess I didn't > > say it clearly enough) > > > > When I first read your message, I tucked it away somewhere, so I could > > reference it again in the future, because I was interested in the MLDBM > > work that you had done, and I thank you for that. But it also made me > > think that maybe I shouldn't be using Postgres, because your results were > > so poor (only for a second or too though :). But I'll bet that a lot of > > people who have never used postgres before are now less likely to download > > it and try it out for themself, because a benchmark swayed them away from > > it. That sounds like a good closer, so I'll stop it there :-) > > > > > > Cees > > > > > > Here is the diff of my changes and a quick comment on why your way kills > > the performance of postgres > > > > *************** > > *** 124,131 **** > > $i_ip = int(rand(20)); > > > > @row_ary = $dbh->selectrow_array("select crr from > benchmark where > > ! rtrim(pub) = 'pub$i_pub' and rtrim(size) = > 'size$i_size' and > > ! rtrim(type) = 'type$i_type' and rtrim(ip) > = 'ip$i_ip'"); > > } > > }; > > > > --- 124,131 ---- > > $i_ip = int(rand(20)); > > > > @row_ary = $dbh->selectrow_array("select crr from > benchmark where > > ! pub = 'pub$i_pub' and size = 'size$i_size' > and > > ! type = 'type$i_type' and ip = 'ip$i_ip'"); > > } > > }; > > > > All I did was remove the 'rtrim' called from the query. I'm assuming you > > did this because Postgres pads all char fields with spaces (varchar > > doesn't do this). What your query ends up doing, is running rtrim on > > every field in the column and then comparing it against your provided > > value (indeces are not used since you are changing the value of the > > primary key). So since your table has 24000 rows and you call rtrim on 4 > > columns in the select, it has a worst case of 24,000 x 4 calls to rtrim > > (of course postgres will short circuit the conditions if the first call > > fails), but in the best case it still has to do 24,000 calls to rtrim. > > I'm amazed that postgres can actually do this this fast :). > > > > In Postgres 7.0 you don't have to worry about the space padding issue, so > > I was able to remove the rtrims from the query. However since you are > > using 6.5.x, what you should have done, is either make your columns > > varchar instead of char, or space pad your values in the query, so that > > you are checking pub = 'pub1 ' instead of rtrim(pub) = 'pub1'. i > > > > However, since you were looking at solutions for a new application, you > > have no excuse for using a database that is several years old, when > > several new stable revisions are available (for free)... > > > > Give it a try and see what happens... > > > > > > > > >
Fw: [OT] Re: Fast DB access
Differentiated Software Solutions Pvt. Ltd., Thu, 19 Apr 2001 20:14:24 -0700
- [OT] Re: Fast DB access Cees Hek
- Re: [OT] Fast DB access Differentiated Software Solutions Pvt. Ltd.,
- Re: [OT] Re: Fast DB acc... Murali V
- Differentiated Software Solutions Pvt. Ltd.,