On 4/28/07, Miro Walker <[EMAIL PROTECTED]> wrote:
Alex,

On 4/25/07, Alexandru Popescu ☀ <[EMAIL PROTECTED]> wrote:
> I may be misreading something, but my main concern with this approach
> is that while it minimizes the size of the storage (which is very cheap
> right now and almost infinite), it imposes a penalty on access
> performance: reading a value needs two "I/O" operations. A caching
> strategy may address this problem, but even if memory is also cheap,
> it is still limited. So, while I see this solution as a good fit for
> cases where huge amounts of duplicate data would be stored, for all
> the other cases I see it as suboptimal.
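(To make the quoted cost concrete: a minimal sketch, using hypothetical
names rather than the actual Jackrabbit classes, of such a hash-addressed
read path. The point is that resolving a binary takes one operation to
fetch the stored reference and a second one to open the shared blob.)

    import java.io.*;

    public class TwoStepReader {
        private final File nodeStore; // small per-property reference files
        private final File dataStore; // deduplicated blobs, named by digest

        public TwoStepReader(File nodeStore, File dataStore) {
            this.nodeStore = nodeStore;
            this.dataStore = dataStore;
        }

        public InputStream readBinary(String propertyId) throws IOException {
            // I/O #1: read the reference (the content digest) stored for the property.
            BufferedReader ref = new BufferedReader(
                    new FileReader(new File(nodeStore, propertyId)));
            String digest;
            try {
                digest = ref.readLine().trim();
            } finally {
                ref.close();
            }
            // I/O #2: open the actual binary in the shared, hash-addressed store.
            return new FileInputStream(new File(dataStore, digest));
        }
    }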

Hm - not sure I agree with the assumption that storage is
cheap/infinite. Try dealing with backups etc. on a repository that is
50GB in size, then try with 100GB+ - it gets to be a major headache.
Even with lots of bandwidth, copying 100GB over a WAN can do all sorts
of nasty things, like crash firewalls. With a versioning repository
using multiple workspaces, disk space usage can grow extremely fast;
we're finding we have many GB of data, 90%+ of which are duplicates.
Something like what Jukka is suggesting would help enormously. I guess
it's one of those "depends on the use case" things :-)
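(The saving comes from writing each distinct binary only once. A minimal
sketch of the deduplicating write path under discussion, again with
hypothetical names rather than the real implementation: a second
workspace or version holding the same bytes only adds a small reference,
not another copy.)

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    public class DedupWriter {
        private final File dataStore; // shared, hash-addressed blob directory

        public DedupWriter(File dataStore) {
            this.dataStore = dataStore;
        }

        /** Stores the content once and returns the digest the property should reference. */
        public String store(byte[] content) throws IOException, NoSuchAlgorithmException {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            File blob = new File(dataStore, hex.toString());
            if (!blob.exists()) { // identical content across workspaces/versions is written once
                FileOutputStream out = new FileOutputStream(blob);
                try {
                    out.write(content);
                } finally {
                    out.close();
                }
            }
            return hex.toString();
        }
    }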


Miro, I do agree with your points, and with the "depends on the use
case" part. Unfortunately, I feel that the current approach addresses a
concern for which there are other possible solutions, while raising new
ones for which there is no solution available (poor read performance,
possible concurrency bottlenecks). By no means am I saying that this is
a totally wrong approach, but I think there are missing aspects that
should be considered upfront rather than later.
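(As a purely hypothetical illustration of the kind of bottleneck I mean:
if every binary save from every session and workspace has to go through
one shared, synchronized index, say for duplicate detection or reference
counting, then concurrent writers end up serialized on that single lock.
This is not the actual proposed implementation, just the shape of the
risk.)

    import java.util.HashMap;
    import java.util.Map;

    public class SharedDigestIndex {
        // One global map of digest -> reference count for the whole repository.
        private final Map<String, Integer> refCounts = new HashMap<String, Integer>();

        // Every concurrent writer contends on this single monitor.
        public synchronized void addReference(String digest) {
            Integer count = refCounts.get(digest);
            refCounts.put(digest, count == null ? 1 : count + 1);
        }

        public synchronized void removeReference(String digest) {
            Integer count = refCounts.get(digest);
            if (count != null) {
                refCounts.put(digest, count - 1);
            }
        }
    }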

bests,

./alex
--
.w( the_mindstorm )p.

