Perrin Harkins wrote:

simran wrote:

I need to be able to say:

* Lookup the _distance_ for the planet _mercury_ on the date _1900-01-01_


On the face of it, a relational database is best for that kind of query. However, if you won't get any fancier than that, you can get by with MLDBM or something similar.

Currently i do this using a postgres database, however, my question is,
is there a quicker way to do this in mod_perl - would a DB_File or some
other structure be better?

Query speed comes into question only when there is a heavy use. Postgress has an 'Explain' facility via pgsql. Just add Explain before the query and you will get the cost of the query. By creating proper indexes you can get good optimization. What if you add a table later and you need to join that with the planet table? If you keep your planet data somewhere else, then the access becomes cumbersome as well as slower. There are many ways to speed up Postgresql. I recommend the Posgresql book by Korryand Susan Douglas. I got it from Barnes and Nobles. IMHO stay with the relational database you are on and find ways to optimize.

A DBM file will be faster. What you can do is build a key out of planet + date, so that you grab the right record with a single access. Either use MLDBM for storing hashes inside each record, or just a simple join/split approach.


This would be a good idea if you are implementing your tool and you know what limitations you will be subjected to.
MySQL would probably also be faster than PostgreSQL for this kind of simple read-only querying, but not as fast as a DBM file. SDBM_File is the fastest DBM around, if you can live with the space limitations it has.

perhaps something such as copying the whole 800,000 rows to
memory (as a hash?) on apache startup?

Postgresql may have a way to 'stick' a table in memory like MySQL.

That would be the fastest by far, but it will use a boatload of RAM. It's pretty easy to try, so test it and see if you can spare the RAM it requires.


- Perrin




Reply via email to