Jacob Levy wrote:

I've often wondered if a more sparse representation of modified data that
*only* contains the modifications would save memory and also give
acceptable lookup performance. Presumably after modification it's likely
that a commit will follow shortly, so then it might make sense to penalize
lookups that happen between modifications and commits. Just a thought.

That is actually already the case: a file is mapped, and when a change is made, MK allocates 4 KB chunks *only* for the pieces it needs to change, and continues to use mapped memory for the rest of the column.
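
To make the chunk idea concrete, here is a minimal Python sketch of a copy-on-write overlay on top of a read-only buffer. It is not MK's actual code: the class name `CowColumn` and its methods are made up for illustration, the column is treated as fixed-size, and only the 4 KB chunk size is taken from the description above.

```python
CHUNK = 4096  # chunk size from the description above; the rest is illustrative


class CowColumn:
    """Read-only base buffer plus private copies of only the chunks written to.

    Hypothetical sketch: 'base' stands in for the memory-mapped column data.
    """

    def __init__(self, base):
        self._base = memoryview(base)
        self._dirty = {}  # chunk index -> bytearray holding the private copy

    def read(self, offset, length):
        """Return 'length' bytes, taking each piece from a copy or the base."""
        out = bytearray()
        end = offset + length
        while offset < end:
            idx, start = divmod(offset, CHUNK)
            n = min(CHUNK - start, end - offset)
            chunk = self._dirty.get(idx)
            if chunk is None:
                lo = idx * CHUNK + start
                out += self._base[lo:lo + n]   # untouched: still the mapping
            else:
                out += chunk[start:start + n]  # modified: private copy
            offset += n
        return bytes(out)

    def write(self, offset, data):
        """Overwrite bytes in place, copying only the chunks that are hit."""
        if offset < 0 or offset + len(data) > len(self._base):
            raise IndexError("write outside the column (size is fixed here)")
        pos = 0
        while pos < len(data):
            idx, start = divmod(offset + pos, CHUNK)
            n = min(CHUNK - start, len(data) - pos)
            chunk = self._dirty.get(idx)
            if chunk is None:
                # First write to this chunk: copy it out of the base buffer.
                chunk = bytearray(self._base[idx * CHUNK:(idx + 1) * CHUNK])
                self._dirty[idx] = chunk
            chunk[start:start + n] = data[pos:pos + n]
            pos += n

    def dirty_bytes(self):
        """Memory held by private copies, i.e. the cost of pending changes."""
        return sum(len(c) for c in self._dirty.values())
```

With a 64 KB base buffer, a five-byte write costs one 4 KB copy and leaves the base untouched; a write that straddles a chunk boundary copies both chunks it touches.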


Two extra details: 1) insertions and deletions often cause more buffers to be allocated and copied than necessary, even though more could be done by simply moving and re-using the original mapping, and 2) all of this happens per column, i.e. only for those properties which are actually modified. The fly in the ointment is insertion/deletion of rows.

There are lots of tricks one could play to take this further. Commit-aside was created in anticipation of that. The key is to allow fast access by byte offset into such a mix of original-mapped and modified/shifted regions.
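
One way to picture the byte-offset lookup is a sorted list of segments searched with a binary search. The Python sketch below is only an illustration of that idea, not how MK or commit-aside does it; `SegmentMap` and `insert_bytes` are hypothetical names, and each segment is simply a (buffer, start, length) triple.

```python
from bisect import bisect_right


class SegmentMap:
    """Logical byte range stitched together from pieces of several buffers.

    Hypothetical sketch: each segment is (buffer, start, length), listed in
    logical order. A buffer may be the original mapping or a new allocation.
    """

    def __init__(self, segments):
        self._segs = [s for s in segments if s[2] > 0]  # drop empty pieces
        self._starts = []  # logical offset at which each segment begins
        pos = 0
        for _, _, length in self._segs:
            self._starts.append(pos)
            pos += length
        self._size = pos

    def __len__(self):
        return self._size

    def byte_at(self, offset):
        """Find the owning segment in O(log n) and read one byte from it."""
        if not 0 <= offset < self._size:
            raise IndexError(offset)
        i = bisect_right(self._starts, offset) - 1
        buf, start, _ = self._segs[i]
        return buf[start + (offset - self._starts[i])]


def insert_bytes(mapped, at, new):
    """Model an insertion without copying 'mapped': the tail is only shifted
    logically, by pointing a third segment at the same original buffer."""
    return SegmentMap([
        (mapped, 0, at),
        (new, 0, len(new)),
        (mapped, at, len(mapped) - at),
    ])
```

Inserting two bytes into the middle of an eight-byte buffer then yields a ten-byte logical range in which both the head and the tail are still served from the original buffer, and only the two new bytes live elsewhere.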

There is also another way to do this: introduce a layer at the view level which collects changes and "applies them virtually". This, btw, is where I'm planning to go in the future (yeah, promises, promises, I know...). I even made a web page for it not so long ago, though it was never made public: http://www.equi4.com/mkisolation.html - it contains a few inaccuracies and skims over a few details, but it ought to get the basic idea across.
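
As a rough illustration of what "applies them virtually" could mean, here is a small Python sketch of a layer that records property changes and answers reads as if they had been applied, while the underlying rows stay untouched until commit. It is not the design from the page above and not the Metakit API: `OverlayView` and its methods are invented for this example, rows are plain dicts, and row insertion/deletion is left out entirely.

```python
class OverlayView:
    """Collects property changes on top of a base view (hypothetical sketch).

    'base_rows' is a list of dicts standing in for a view. Reads consult the
    pending changes first, so the base is only modified by commit().
    """

    def __init__(self, base_rows):
        self._base = base_rows
        self._changes = {}  # (row index, property name) -> new value

    def get(self, row, prop):
        key = (row, prop)
        if key in self._changes:
            return self._changes[key]  # changed: answer from the overlay
        return self._base[row][prop]   # unchanged: answer from the base

    def set(self, row, prop, value):
        if not 0 <= row < len(self._base):
            raise IndexError(row)
        self._changes[(row, prop)] = value

    def pending(self):
        """Number of recorded changes not yet applied to the base."""
        return len(self._changes)

    def rollback(self):
        """Forget all recorded changes; the base was never touched."""
        self._changes.clear()

    def commit(self):
        """Apply the recorded changes to the base rows for real."""
        for (row, prop), value in self._changes.items():
            self._base[row][prop] = value
        self._changes.clear()
```

The memory cost between modification and commit is then proportional to the number of changed properties rather than to the size of the data, at the price of an extra lookup on every read.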

-jcw

_____________________________________________
Metakit mailing list  -  [EMAIL PROTECTED]
http://www.equi4.com/mailman/listinfo/metakit
