Alex,

On 4/25/07, Alexandru Popescu ☀ <[EMAIL PROTECTED]> wrote:
> I may be misreading something, but my main concern with this approach
> is that while it minimizes the size of the storage (which is very cheap
> right now and almost infinite), it penalizes access performance:
> reading a value needs two "I/O" operations. The caching strategy may
> address this problem, but even though memory is also cheap, it is
> still limited. So while I see this solution as a fit for cases where
> huge amounts of duplicate data would be stored, for all the other
> cases I see it as suboptimal.

Hm - not sure I agree with the assumption that storage is
cheap/infinite. Try dealing with backups / etc on a repository that is
50GB in size, then try with 100GB+ - it gets to be a major headache.
Even with lots of bandwidth, copying 100GB over a WAN can do all sorts
of nasty things, like crash firewalls, etc. With a versioning
repository using multiple workspaces, disk space usage can grow
extremely fast and we're finding we have many GB of data, 90%+ of
which is duplicates. Something like what Jukka is suggesting would
help enormously. I guess it's one of those "depends on the use case"
things :-)
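
For concreteness, here's a rough sketch (plain Java, hypothetical
class/field names, not the actual API being proposed) of the kind of
content-addressed store we're talking about: identical binaries collapse
to a single record keyed by their digest, and the second lookup Alexandru
mentions is softened by a small read-through cache.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.HashMap;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Sketch only: identical values are stored once under their SHA-1
    // digest; nodes would keep just the digest, so a read is two steps
    // (node -> digest, digest -> bytes), with an LRU cache on step two.
    public class DedupStore {

        // digest -> value; stands in for the on-disk blob store
        private final Map<String, byte[]> blobs = new HashMap<String, byte[]>();

        // tiny read-through LRU cache for the second lookup
        private final Map<String, byte[]> cache =
                new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
                    protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                        return size() > 100;
                    }
                };

        // Write the value once and return its identifier; duplicates
        // collapse to a single stored record.
        public String put(byte[] value) throws Exception {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(value)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            String digest = hex.toString();
            blobs.putIfAbsent(digest, value);
            return digest;
        }

        // Resolve the identifier, going through the cache first.
        public byte[] get(String digest) {
            return cache.computeIfAbsent(digest, blobs::get);
        }

        public static void main(String[] args) throws Exception {
            DedupStore store = new DedupStore();
            String a = store.put("same big binary".getBytes(StandardCharsets.UTF_8));
            String b = store.put("same big binary".getBytes(StandardCharsets.UTF_8));
            System.out.println(a.equals(b)); // true: only one copy is kept
        }
    }

In a repository with many workspaces full of the same binaries, that
single-copy behaviour is exactly where the space savings come from.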

miro
