Hi,
 
We're continuing this discussion.
 
Responses to queries raised in the last 24 hours:
 
WIM > Could you post the SQL statements used to create the tables as well?
See our posting of April 17th; the attachments include the CREATE TABLE SQL as well.
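For the record, a minimal sketch of the shape of such a table in Perl/DBI (the table and column names here are our illustration only; the April 17th attachment has the actual definitions we benchmarked):

use strict;
use DBI;

# Hypothetical sketch of the composite-key ad table -- see the attachment
# for the real CREATE TABLE statements.
my $dbh = DBI->connect('dbi:Pg:dbname=ads', '', '', { RaiseError => 1 });
$dbh->do(<<'SQL');
CREATE TABLE ad_lookup (
    publisher  varchar(32) NOT NULL,
    size       varchar(16) NOT NULL,
    ad_type    varchar(16) NOT NULL,
    ip_number  varchar(15) NOT NULL,
    ad_id      integer     NOT NULL,
    weight     integer     NOT NULL,
    PRIMARY KEY (publisher, size, ad_type, ip_number, ad_id)
);
SQL
$dbh->disconnect;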
 
CLAYTON > [drfrog]$ perl fast_db.pl
CLAYTON > postgres
CLAYTON >     16 wallclock secs ( 0.05 usr + 0.00 sys =  0.05 CPU) @ 400.00/s (n=20)
CLAYTON > mysql
CLAYTON >      3 wallclock secs ( 0.07 usr + 0.00 sys =  0.07 CPU) @ 285.71/s (n=20)
CLAYTON > postgres
CLAYTON >     17 wallclock secs ( 0.06 usr + 0.00 sys =  0.06 CPU) @ 333.33/s (n=20)
CLAYTON > mysql
CLAYTON >     3 wallclock secs ( 0.01 usr + 0.01 sys =  0.02 CPU) @ 1000.00/s (n=20)

MATHEW > Again, checkout PostgreSQL 7.1 -- I believe "commit" and "rollback" (as
MATHEW > you put it) are available. BTW, I would like to see that comment about
MATHEW > MS-Access posted to pgsql-general... I dare ya. :P
We were saying that it was shocking that MySQL was justifying its lack of commit and rollback. We have no complaints with PostgreSQL on the features front.
Several people have recommended PostgreSQL 7.1.
We take this as valid feedback. We'll install and use 7.1.
 
MATHEW > I'm on several postgresql mailing lists and couldn't find a recent post
MATHEW > from you complaining about 6.5.3 performance problems (not even by an
MATHEW > archive search). Your benchmark is worthless until you try postgresql
MATHEW > 7.1. There have been two major releases of postgresql since 6.5.x (ie.
MATHEW > 7.0 and 7.1) and several minor ones over a total of 2-3 years. It's no
MATHEW > secret that they have tremendous performance improvements over 6.5.x. So
MATHEW > why did you benchmark 6.5.x?
I haven't posted anything to the Postgres newsgroups for a long time; I was too cheesed off. People kept defending Postgres without acknowledging or solving problems. Let's not go into this.
 
For now we are ignoring any discussion of Oracle etc. We would be glad to hear more suggestions on our benchmark.
 
Several people have complained that this is not a fair test. We are not benchmarking professionals; rather, we are software developers using benchmarks as a way of choosing among alternatives.
If people have specific suggestions on ways of improving our benchmark, we will be very happy to hear them.
Links on how to design and run such benchmarks, for amateurs like us, are also welcome.
 
Thanks and Regards,
 
S Muthu Ganesh & V Murali
Differentiated Software Solutions Pvt. Ltd.,
90, 3rd Cross,2nd Main,
Ganga Nagar,
Bangalore - 560 032
Phone : 91 80 3631445, 3431470
Visit us at www.diffsoft.com
----- Original Message -----
Sent: Tuesday, April 17, 2001 4:41 PM
Subject: Fast DB access

Hi,
 
A few months back we asked the mod_perl mailing list about alternative methods of DB access to Postgres (under the same subject). We got some decent alternatives, and we are now sharing some of the work we have done on this issue.
 
We had a project to program an ad server. This is not really an OLTP application: we have a few screens where some data is captured, and based on this data we have to pick an advertisement and serve it.
The essence of the application is a highly scalable program to deliver ads, which means we wanted a method to pick the ads matching a given set of criteria and then choose one among them.
We wrote a benchmark program, after which we decided to go with MLDBM for our purposes.
Though this is not directly related to mod_perl, we are taking the liberty of posting this message. We hope you find it useful.
 
The specification and results of the benchmark are as follows.
 
Objective : To choose one of the alternative access methods for a read-only DB program
 
Program logic :
Choose a row from a table that has a composite key of 4 attributes.
The 4 attributes we used are publisher, size, type and IP number.
Given values for these 4 attributes, we get the list of advertisements for that combination.
In the live application we choose one of these ads based on a weighted random number.
For the purpose of the benchmark we want to build a hash (or hash reference) of the ads for a given combination of the 4 criteria, as in the sketch below.
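To make the logic concrete, here is a minimal sketch of the lookup in pure Perl (the attribute values are made up for illustration):

use strict;

# Multi-level hash: publisher => size => type => IP => { ad => weight }.
my %ads;
$ads{'pub42'}{'468x60'}{'gif'}{'192.168.1.7'} = { ad1 => 5, ad2 => 3 };

# Given the 4 attributes, pick the last-level hash of ads.
my ($pub, $size, $type, $ip) = ('pub42', '468x60', 'gif', '192.168.1.7');
my $candidates = $ads{$pub}{$size}{$type}{$ip};   # hash ref of ad => weight
# The live application would then pick one ad by weighted random number.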
 
Benchmark Data :
Our benchmark data consists of 100 publishers, 3 sizes, 4 types and 20 IP numbers, which makes 100 x 3 x 4 x 20 = 24,000 combinations of attributes. Each combination in turn contains 10 advertisements.
 
Benchmark alternatives :
We populated this data into
a) A pure in-memory multi-level hash : before the actual benchmark starts, the program populates a multi-level hash whose leaves are the advertisements. The objective is to pick the last-level hash of advertisements.
b) Flat file : create a Linux directory hierarchy mirroring the attributes, i.e. <publisher>/<size>/<type>/<ip number>, where the IP number is the name of a file containing the list of ads. The objective is to locate the right file, open it and build a hash from its contents.
c) Postgres : create a table with the composite primary key <publisher>, <size>, <type>, <ip number>. The objective is to fetch a hash reference of all rows for a given attribute combination.
d) Storable : store the multi-level hash in a disk file using Storable.pm. The objective is to read the Storable file into memory and pick the last-level hash of ads.
e) MLDBM : populate the same structure using MLDBM.pm; identical to Storable except for the module. A sketch of this alternative follows the list.
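Since (e) is what we finally chose, here is a minimal sketch of that alternative (the file name and data are illustrative, not our attached program):

use strict;
use Fcntl;
use MLDBM qw(DB_File Storable);   # DB_File back end, Storable serializer

tie my %ads, 'MLDBM', 'ads.dbm', O_CREAT | O_RDWR, 0644
    or die "Cannot tie ads.dbm: $!";

# Populate: MLDBM serializes everything below the top-level key, so we
# assign the whole nested structure for a publisher in one statement.
$ads{'pub42'} = {
    '468x60' => {
        'gif' => {
            '192.168.1.7' => { ad1 => 5, ad2 => 3 },   # ad => weight
        },
    },
};

# Retrieve: fetching the top-level key deserializes its subtree; ordinary
# dereferencing then reaches the last-level hash of ads.
my $candidates = $ads{'pub42'}{'468x60'}{'gif'}{'192.168.1.7'};

untie %ads;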
 
H/W : Celeron 433 with 64 MB RAM and an IDE HDD, running RH 6.1, perl 5.005 and Postgres 6.5.3
 
Benchmark run :
A benchmark run consists of accessing each of the benchmark alternatives 40 times, i.e. we generate a random combination of the 4 selection attributes 40 times, access the particular data structure and pick up the ads.
We repeat this process twice to ensure that we get consistent results. A sketch of the harness follows.
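The harness looks roughly like this (the lookup_* subroutines are placeholders for the per-alternative lookups in the attached fast_db.pl, and the attribute values are made up):

use strict;
use Benchmark;

my @publishers = map { "pub$_" } 1 .. 100;
my @sizes      = qw(468x60 120x600 88x31);         # 3 sizes
my @types      = qw(gif jpg html text);            # 4 types
my @ips        = map { "192.168.1.$_" } 1 .. 20;   # 20 IP numbers

# A random combination of the 4 selection attributes.
sub random_key {
    return ($publishers[rand @publishers], $sizes[rand @sizes],
            $types[rand @types], $ips[rand @ips]);
}

sub lookup_storable { }   # placeholder: fetch from the Storable image
sub lookup_mldbm    { }   # placeholder: fetch from the tied MLDBM file

# Two passes, each timing 40 random lookups per alternative.
for my $pass (1, 2) {
    print "pass $pass\n";
    timethese(1, {
        storable => sub { lookup_storable(random_key()) for 1 .. 40 },
        mldbm    => sub { lookup_mldbm(random_key())    for 1 .. 40 },
    });
}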
 
Benchmark results :
hash:  0 wallclock secs ( 0.00 usr +  0.00 sys =  0.00 CPU)
flatfile: 5 wallclock secs ( 0.08 usr +  0.11 sys =  0.19 CPU)
postgres: 36 wallclock secs ( 0.04 usr +  0.01 sys =  0.05 CPU)
storable: 17 wallclock secs (16.24 usr +  0.61 sys = 16.85 CPU)
mldbm:  0 wallclock secs ( 0.08 usr +  0.08 sys =  0.16 CPU)

Benchmark interpretation :
We did not really want to write this section, but we have interpreted these results and chosen MLDBM.
The first option is not viable for us: we need to carry the values forward between two runs of the program, and cannot recreate the hash every time in the live app.
We had experimented with Postgres earlier, with disastrous results; in fact, Postgres is what motivated us to seek alternatives.
That left a choice between flat file and MLDBM.
It was a close call, but we found MLDBM far more compact, and users were more comfortable maintaining a system with a single file rather than a multitude of files on the file system.
We also suspect that if we add more rows the flat file approach won't scale, though we have not tested this fully.
 
In the end we chose MLDBM. We have developed the software, called opticlik (www.opticlik.com), and it's doing just great: it serves around 1.2 million ads a day, and in short bursts it serves up to 100 ads a second.
 
We've attached our benchmark program and SQL along with this mail. If any of you have time to run through the program and give us feedback, that would be great. We also welcome any requests for clarification.
 
Regards,
 
S Muthu Ganesh & V Murali
 
Differentiated Software Solutions Pvt. Ltd.,
90, 3rd Cross,2nd Main,
Ganga Nagar,
Bangalore - 560 032
Phone : 91 80 3631445, 3431470
Visit us at www.diffsoft.com
