Hi,
A few months back we asked the mod_perl mailing list about
alternative methods of DB access to Postgres (under the same subject). We got some
decent alternatives, and we are now reporting back some of the work we have done on this
issue.
Our project was to program an ad server. This is
not really an OLTP application: there are a few screens where some data
is captured, and based on this data we have to pick an advertisement to
serve.
The essence of the application is a highly
scalable program to deliver ads, which means we wanted a method to
pick the ads matching a set of criteria and choose one among them.
We wrote a benchmark program, after which we
decided to go with MLDBM for our purposes.
Though this is not directly related to mod_perl, we
are taking the liberty of posting this message. We hope you find it
useful.
The specification and results of the benchmark are as
follows.
Objective : To choose one of the alternative access
methods for a read-only DB program
Program logic :
Choose rows from a table which has a composite key
containing 4 attributes.
The 4 attributes we used are
publisher, size, type and ip number.
Given values of these 4 attributes, we get a
list of advertisements matching them.
In the live application we will choose one of these
ads based on a weighted random number.
For the purpose of the benchmark we only want to create a
hash (or hash reference) of the ads matching these 4 criteria.
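The weighted random pick mentioned above can be sketched as follows (a minimal illustration; the ad names and weights are made up for the example, and the real application may do this differently):

```perl
#!/usr/bin/perl -w
use strict;

# Pick one ad from the last-level hash, with probability
# proportional to its weight.  $ads is a hash ref of ad => weight.
sub pick_weighted {
    my ($ads) = @_;
    my $total = 0;
    $total += $_ for values %$ads;
    my $r = rand($total);
    for my $ad (sort keys %$ads) {   # sorted for a deterministic walk
        $r -= $ads->{$ad};
        return $ad if $r < 0;
    }
    return (sort keys %$ads)[-1];    # guard against rounding at the edge
}

my %ads = ( ad1 => 10, ad2 => 30, ad3 => 60 );
print pick_weighted(\%ads), "\n";
```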
Benchmark data :
Our benchmark data consists of 100
publishers, 3 sizes, 4 types and 20 ip numbers, which makes a data
structure containing 24,000 combinations of attributes. Each combination in turn
contains 10 advertisements.
Benchmark alternatives :
We have populated this data into
a) A pure in-memory multi-level hash : Before
starting the actual benchmark the program populates a multi-level hash, each
level of which finally points to the advertisements. The objective is to pick the
last-level hash of advertisements.
b) Flat file : Create a Linux directory structure
with the same hierarchy as the attributes, i.e., the directory structure is
<publishers>/<sizes>/<types>/<ip numbers>; the ip number is
the file name, and the file contains a list of ads. The objective is to pick the right
file, open it and create a hash from its contents.
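A sketch of the flat-file lookup in (b) — the paths and the "ad weight" line format here are our guesses at a reasonable layout for illustration, not the exact code we used:

```perl
#!/usr/bin/perl -w
use strict;
use File::Path;    # mkpath / rmtree

# Build one branch of the <publisher>/<size>/<type>/<ip> hierarchy.
my $root = "/tmp/adtree.$$";
mkpath("$root/pub42/468x60/gif");

# The ip-number file holds "ad weight" pairs, one per line.
open my $out, '>', "$root/pub42/468x60/gif/ip07" or die $!;
print $out "ad1 10\nad2 90\n";
close $out;

# Lookup: map the 4 attributes to a path, read the file into a hash.
sub lookup_ads {
    my ($pub, $size, $type, $ip) = @_;
    open my $in, '<', "$root/$pub/$size/$type/$ip" or return {};
    my %ads;
    while (<$in>) {
        my ($ad, $weight) = split;
        $ads{$ad} = $weight;
    }
    close $in;
    return \%ads;
}

my $ads = lookup_ads('pub42', '468x60', 'gif', 'ip07');
print "$_ => $ads->{$_}\n" for sort keys %$ads;
rmtree($root);
```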
c) Postgres : Create a table with the composite primary
key <publisher>, <size>, <type>, <ip number>.
The objective is to pick a hash reference of all rows matching an attribute
combination.
d) Storable : Store the multi-level hash in a disk
file using Storable.pm. Objective : read the Storable file into
memory and pick the last-level hash of ads.
e) MLDBM : Populate an MLDBM data structure using
MLDBM.pm. Identical to Storable except that we are using MLDBM.
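For (d) and (e), a minimal sketch of the Storable variant (core Perl). As we understand it, MLDBM is used much the same way except that the multi-level hash is tied to a DBM file (e.g. `tie %db, 'MLDBM', ...`), so a lookup reads only the record it needs rather than thawing the whole structure — which is consistent with the timings below:

```perl
#!/usr/bin/perl -w
use strict;
use Storable qw(store retrieve);

# A small multi-level hash: publisher -> size -> type -> ip -> ads.
# Values here are made up for the example.
my %data = (
    pub42 => {
        '468x60' => {
            gif => {
                ip07 => { ad1 => 10, ad2 => 90 },
            },
        },
    },
);

# Alternative (d): freeze the whole structure to disk ...
my $file = "/tmp/ads.$$.stor";
store(\%data, $file);

# ... and on each lookup read it all back, then walk down to the
# last-level hash of ads.
my $thawed = retrieve($file);
my $ads = $thawed->{pub42}{'468x60'}{gif}{ip07};
print "$_ => $ads->{$_}\n" for sort keys %$ads;
unlink $file;
```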
H/W : Celeron 433 with 64 MB RAM and an IDE hard disk, running Red Hat
6.1, perl 5.005 and Postgres 6.5.3
Benchmark run :
A benchmark run consists of accessing each of the
benchmark alternatives 40 times, i.e., we generate a random combination of the 4
selection attributes 40 times, access a particular data structure and pick up
the ads.
We repeat this process twice to ensure that we get
consistent results.
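A run like this can be driven with the standard Benchmark module, which produces timings in the format reported here. A simplified sketch — the attribute values are invented (only their counts match our data), and `lookup_ads` is a stand-in for whichever access method is being timed:

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark qw(timethese);

# Generate a random combination of the 4 selection attributes:
# 100 publishers, 3 sizes, 4 types, 20 ip numbers.
sub random_combo {
    return ( 'pub' . int(rand(100)),
             ('468x60', '120x600', '728x90')[int(rand(3))],
             ('gif', 'jpg', 'flash', 'text')[int(rand(4))],
             'ip' . int(rand(20)) );
}

# Stand-in for one of the access methods under test.
sub lookup_ads { my @combo = @_; return {} }

# One benchmark run: 40 random lookups per alternative.
timethese(40, {
    hash => sub { lookup_ads(random_combo()) },
});
```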
Benchmark results :
hash:     0 wallclock secs ( 0.00 usr +  0.00 sys =  0.00 CPU)
flatfile: 5 wallclock secs ( 0.08 usr +  0.11 sys =  0.19 CPU)
postgres: 36 wallclock secs ( 0.04 usr +  0.01 sys =  0.05 CPU)
storable: 17 wallclock secs (16.24 usr +  0.61 sys = 16.85 CPU)
mldbm:    0 wallclock secs ( 0.08 usr +  0.08 sys =  0.16 CPU)
Benchmark interpretation :
We did not want to write this section, but we
have interpreted these results and chosen MLDBM.
The first option is not viable for us:
we want to carry the values forward between 2 runs of the program and can't
recreate the hash every time in the live app.
We had experimented with Postgres earlier, with
disastrous results. In fact, Postgres is what motivated us to seek
alternatives.
That left a choice between flatfile and MLDBM (from
the results).
It was a close one, but we found MLDBM far more
compact, and users were more comfortable maintaining a system with a single file
rather than a multitude of files on the file system.
We also suspect that if we add more rows the flatfile approach
won't scale, though we have not tested this fully.
In the end we chose MLDBM. We have
developed the software, called opticlik (www.opticlik.com), and it's doing just great:
it serves around 1.2 million ads a day, and in short bursts it serves up to 100 ads a
second.
We've attached our benchmark program and SQL with
this mail. If any of you have the time to run through the program and give us
feedback, that would be great. We are also happy to provide any clarifications that may
be required.
Regards,
S Muthu Ganesh & V Murali
Differentiated Software Solutions Pvt. Ltd.,
90, 3rd Cross, 2nd Main, Ganga Nagar, Bangalore - 560 032
Phone : 91 80 3631445, 3431470
Visit us at www.diffsoft.com