Brannon King wrote:
John Stanton wrote:

You don't seem to need a data manipulation system like Sqlite, more a form of high volume storage. Do you really need elaborate SQL, journalling, ROLLBACK and assured disk storage?

Did you consider some form of hashed storage, perhaps linear hashing, to build a compact and high performance associative array for your sparsely keyed data?

Do you really need the overhead of B-trees if you are just storing a sparse array?
JS

I don't need journaling or rollback. I'd love a way to shut them off. But elaborate SQL, that sure is handy. I'm not just storing, I'm viewing stored, compressed data. I definitely need some way of querying sparse matrix data that is larger than my DRAM. Sqlite sure seems like the quickest route to a workable product for that. It has all the streaming/caching built in. Because of that, I assume it is faster than naive random file access. It supports complex data queries and indexes, both things I would need anyway. In the world of programming, I think many will agree you should get a working product, then make it faster. I'm just trying to get the most speed out of the easiest tool. If I need to rewrite the file storage for the next version, we can consider the cost versus benefit for that separately.

I saw your performance requirements and data rate, which look difficult to achieve while you are writing journals and ensuring the integrity of disk records.

You will find that Sqlite is much slower than random file access, because Sqlite is built on top of random file access. You get random file access speed less all the overhead of Sqlite's journals and B-trees.

We have an application using storage something like yours and we use memory mapped areas with AVL trees for indexing. If it needed to run as fast as yours we would probably use hashing rather than the binary trees. A sparse index is realized by concatenated keys. This method dynamically uses memory for caching, but is not limited to physical memory size, only virtual memory. It assumes POSIX capabilities from the OS.

With hashing you avoid the overhead of B-tree balancing on insertion, but pay for it by not having the keys accessible in an ordered sequence.

Ask yourself and your users/customers whether the easiest solution or the best solution is the most satisfactory.
