Just a tangential thought: has there been any investigation into reducing the amount of write traffic with the existing stores?

E.g., establishing a floor for the reference count: if an object doesn't have n refs, don't write it to disk? This would impact the hit rate, of course, but might help in situations where disk caching is desirable but writing is the bottleneck...
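Something along these lines is what I'm imagining; just a rough sketch,
and the names (CacheObject, ref_count, swap_out_to_disk,
min_refs_for_disk) are made up for illustration, not the actual store
API:

    #include <stdbool.h>
    #include <stdio.h>

    /* Sketch of a write-admission floor: only swap an object out to
     * disk once it has been requested at least min_refs_for_disk times. */

    typedef struct {
        int ref_count;   /* how many times this object has been requested */
        bool on_disk;    /* already swapped out to a cache_dir? */
    } CacheObject;

    static const int min_refs_for_disk = 2;   /* the floor; tunable */

    static void swap_out_to_disk(CacheObject *obj)
    {
        /* placeholder for the real swap-out path */
        printf("swapping object out to disk\n");
        obj->on_disk = true;
    }

    static void maybe_swap_out(CacheObject *obj)
    {
        if (obj->on_disk)
            return;                              /* already written */
        if (obj->ref_count < min_refs_for_disk)
            return;                              /* not popular enough yet:
                                                    stay memory-only, skip
                                                    the disk write */
        swap_out_to_disk(obj);
    }

    int main(void)
    {
        CacheObject obj = { 0, false };
        obj.ref_count = 1;
        maybe_swap_out(&obj);   /* skipped: below the floor */
        obj.ref_count = 2;
        maybe_swap_out(&obj);   /* written: floor reached */
        return 0;
    }

Even a floor of 2 would presumably skip the write for the long tail of
objects that are only ever requested once, which tends to be a large
share of what a forward cache sees.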


On 26/11/2008, at 9:14 AM, Kinkie wrote:

On Tue, Nov 25, 2008 at 10:23 PM, Pablo Rosatti
<[EMAIL PROTECTED]> wrote:
Amazon uses BerkeleyDB for several critical parts of its website. The Chicago Mercatile Exchange uses BerkeleyDB for backup and recovery of its trading database. And Google uses BerkeleyDB to process Gmail and Google user accounts. Are you sure BerkeleyDB is not a good idea to replace the Squid filesystems even COSS?

Squid3 uses a modular storage backend system, so you're more than
welcome to try to code it up and see how it compares.
Generally speaking, the needs of a data cache such as Squid are very
different from those of a general-purpose storage backend.
Among the key differences:
- the data in the cache has little or no value.
  It's important to know whether a file was corrupted, but it can
  always be thrown away and fetched again from the origin server at a
  relatively low cost.
- the workload is mostly writes.
  A well-tuned forward proxy will have a hit rate of roughly 30%,
  which works out to two or three store writes for every read on
  average (a back-of-envelope sketch follows this list).
- data is stored in incremental chunks.
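As a back-of-envelope sketch (assuming, simplistically, that every miss
becomes exactly one store write and every hit exactly one store read,
ignoring memory hits and uncachable replies):

    #include <stdio.h>

    /* Approximate store write/read ratio for a forward proxy at a few
     * hit rates, under the simplification above. */
    int main(void)
    {
        const double hit_rates[] = { 0.25, 0.30, 0.35 };
        for (int i = 0; i < 3; i++) {
            const double misses = 1.0 - hit_rates[i];
            printf("hit rate %.0f%%: %.1f writes per read\n",
                   hit_rates[i] * 100.0, misses / hit_rates[i]);
        }
        return 0;
    }

which prints roughly 3.0, 2.3 and 1.9 writes per read respectively.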

Given these characteristics, many of the mechanisms that database-like
systems provide, such as journaling and transactions, are a waste of
resources.
COSS is explicitly designed to handle a workload of this kind. I would
not trust any valuable data to it, but it's about as fast as it gets
for a cache.

IMHO BDB might be much more useful as a metadata storage engine, since
metadata has a very different access pattern from a general-purpose
cache store.
But if I had any time to devote to this, my priority would be bringing
3.HEAD COSS up to speed with the work Adrian has done in 2.

--
   /kinkie

--
Mark Nottingham       [EMAIL PROTECTED]