Re discussion on limits and sizes... (crossposted to starkit mailing list).

I just did a few tests (from Tcl), using hashing on a fast Linux box. I wanted to follow up because I think I gave the impression that millions of rows is a lot...

Well, I don't want to get into actual timings, given that it all depends on everything from etching process used to the phase of the moon anyway, but here's some info which may be of use.

- databases of several hundred MB can work really well
- same for "millions of rows", it's no longer such a big deal
- you can add many thousands of rows per second, even from a scripting interface
- as of MK 2.3/2.4, there are persistent hash indexes, and they really
  have O(1) performance (tens to hundreds of thousands of accesses per second)

There continue to be effects which can surprise you, because column-wise data is something radically, totally, fundamentally different, but you have to keep in mind that these "surprises" can go both ways...
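
To make the "both ways" point a bit more concrete, here is a toy sketch in plain Python. It is not how MK stores anything internally; the names (rows, columns, add_column, delete_row) are made up for illustration. It only shows why some operations get cheaper and others get more expensive once all values of a field sit together.

```python
# Illustrative only: plain Python lists, not Metakit's actual storage format.

# Row-wise: one record per row, all fields of a row stored together.
rows = [
    {"name": "ann", "age": 31},
    {"name": "bob", "age": 45},
    {"name": "cy", "age": 27},
]

# Column-wise: one list per column, all values of a field stored together.
columns = {
    "name": ["ann", "bob", "cy"],
    "age": [31, 45, 27],
}


def total_age_rows(rows):
    # Visits every record, even though only one field is needed.
    return sum(r["age"] for r in rows)


def total_age_columns(columns):
    # Touches only the "age" column.
    return sum(columns["age"])


def add_column(columns, name, default):
    # Cheap in a column layout: one new list, existing columns untouched.
    n = len(next(iter(columns.values())))
    columns[name] = [default] * n


def delete_row(columns, index):
    # The flip side: removing one row has to touch every column.
    for values in columns.values():
        del values[index]
```

Scanning one field and adding a field are the pleasant surprises here; deleting a row from a table with many fields is the other kind.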

At the C/C++ and Python and Tcl level, be prepared to see performance levels which seem unreal. There will be an order of magnitude difference between a hand-coded C/C++ in-memory hash table and the one MK offers persistently, but that should be about it. I've been seeing hash access rates of 25K searches/sec, using a new experimental Tcl wrapper (which appears to slow it down 4-fold, so there's a lot of leeway for optimization).
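
If you want a number of your own to compare against, a rough way to get a searches/sec figure looks like the sketch below. Note the assumptions: it times a plain Python dict, i.e. an in-memory hash table, not an MK hash view, and build_table and lookups_per_second are hypothetical helper names. Treat the result as a baseline for your own machine, nothing more.

```python
import time


def build_table(n):
    # Stand-in data set: n string keys in an in-memory hash table (a dict).
    return {"key%d" % i: i for i in range(n)}


def lookups_per_second(table, keys):
    # Look up every key once and return (hits, measured lookups per second).
    start = time.perf_counter()
    hits = 0
    for k in keys:
        if k in table:
            hits += 1
    elapsed = time.perf_counter() - start
    rate = len(keys) / elapsed if elapsed > 0 else float("inf")
    return hits, rate


if __name__ == "__main__":
    table = build_table(100000)
    hits, rate = lookups_per_second(table, list(table))
    print("%d hits, about %.0f lookups/sec" % (hits, rate))
```

To compare with a persistent store, keep the key set and the loop the same and swap only the lookup itself, so the scripting overhead is identical in both runs.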

I've never taken the time to compare this with other DBs (again: etching and moon phases), but it seems to me that MK packs a lot of oomph.

Please feel free to follow up with real-life stories and experiences of your own.

-jcw
