Hi,

On 4/25/07, Alexandru Popescu ☀ <[EMAIL PROTECTED]> wrote:
> On 4/23/07, Jukka Zitting <[EMAIL PROTECTED]> wrote:
> > My idea is to store each value in a unique and immutable "value
> > record" identified by a "value identifier". Duplicate values are only
> > stored once in a single value record. This saves space especially when
> > storing multiple copies of large binary documents and allows value
> > equality comparisons based on just the identifiers.
> > [...]
>
> I may be misreading something, but my main concern with this approach
> is that while minimizing the size of the storage (which is very cheap
> right now and almost infinite) it has a penalty on the access
> performance: needing 2 "I/O" operations for reading a value. The
> caching strategy may address this problem, but even if memory is also
> cheap it is still limited. So, while I see this solution fit for
> cases where huge amounts of duplicate data would be stored, for all
> the other cases I see it as suboptimal.

Good point. Apart from the space savings my main goal was to have
short constant-length identifiers that could be used for equality
comparisons instead of comparing the value contents. This would be
especially beneficial for things like names and paths and probably
also other medium-length strings, but I agree that the
locality-of-access issue should be resolved somehow.
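
To make the identifier idea concrete, here is a minimal sketch in
Python. It assumes SHA-1 content hashes as the value identifiers and an
in-memory dict as the record store; ValueStore, put, get and
record_count are hypothetical names chosen for illustration, not
anything from the actual code base, and the example paths are made up.

```python
import hashlib


class ValueStore:
    """Toy in-memory store: one immutable record per distinct value."""

    def __init__(self):
        # value identifier (bytes) -> value content (bytes)
        self._records = {}

    def put(self, content: bytes) -> bytes:
        # Assumption: the SHA-1 digest of the content is the identifier.
        # It is always 20 bytes, regardless of the content length.
        value_id = hashlib.sha1(content).digest()
        # Only create a record if this value has not been seen before;
        # duplicates simply share the existing record.
        self._records.setdefault(value_id, content)
        return value_id

    def get(self, value_id: bytes) -> bytes:
        # Reading the content requires this extra lookup by identifier.
        return self._records[value_id]

    def record_count(self) -> int:
        return len(self._records)


store = ValueStore()
a = store.put(b"/content/site/en/index")
b = store.put(b"/content/site/en/index")  # duplicate value
c = store.put(b"/content/site/de/index")

# Equality checks touch only the fixed-length identifiers.
same = (a == b)
different = (a != c)
```

Comparing a and b needs only the two 20-byte identifiers, while get is
the second access that the locality concern is about: anything that
needs the actual content has to follow the identifier to its record.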

BR,

Jukka Zitting
