Hi,

[Just throwing an idea around, no active plans for further work on this.]

One of the biggest performance bottlenecks in current repository
implementations is disk speed, especially seek times but in many
cases also the raw data transfer rate. To work around those
limitations we've used various caching strategies in Jackrabbit that
considerably complicate the codebase and still struggle with cache
misses and write-through performance.

As an alternative to such designs, I was thinking of a microkernel
implementation that would keep the *entire* tree structure in memory,
i.e. only use the disk or another backend for binaries and possibly
for periodic backup dumps. Fault tolerance against hardware failures
or other restarts would be achieved by requiring a clustered
deployment where all content is kept as copies on at least three
separate physical servers. Redis (http://redis.io/) is a good example
of the potential performance gains of such a design.
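To make the idea a bit more concrete, here's a rough Java sketch of
what the in-memory side could look like (all class and method names
are made up for illustration, not an actual API): the whole tree held
in memory as one bundle per node, with binaries only referenced by id
and resolved against a separate backend.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    /**
     * Hypothetical sketch: the entire tree lives in memory as one
     * bundle per node, while binaries are only referenced by id and
     * fetched from a separate (disk- or cloud-backed) blob store.
     */
    class InMemoryNodeStore {

        /** A node with its properties and child node references. */
        static class Bundle {
            final Map<String, String> properties = new ConcurrentHashMap<>();
            final Map<String, String> children = new ConcurrentHashMap<>(); // child name -> node id
        }

        private final Map<String, Bundle> bundles = new ConcurrentHashMap<>();
        private final BlobStore blobs; // disk/cloud backend, used only for binaries

        InMemoryNodeStore(BlobStore blobs) {
            this.blobs = blobs;
        }

        Bundle getNode(String id) {
            return bundles.get(id); // pure in-memory lookup, no disk seek
        }

        void putNode(String id, Bundle bundle) {
            bundles.put(id, bundle);
        }

        byte[] getBinary(String blobId) {
            return blobs.read(blobId); // only binaries hit the backend
        }

        /** Placeholder for whatever backend ends up storing the binaries. */
        interface BlobStore {
            byte[] read(String blobId);
        }
    }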

To estimate how much memory such a model would need, I looked at the
average bundle size of a vanilla CQ5 installation. There the average
bundle (i.e. a node with all its properties and child node references)
is just 251 bytes. Even assuming larger bundles and some level of
storage and index overhead, it seems safe to budget up to about 1kB of
memory per node on average. That would allow storing some 1M nodes in
each 1GB of memory.
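(Spelling out the arithmetic: 1GB / 1kB = 2^30 / 2^10 = 2^20, i.e.
roughly one million nodes per gigabyte, and the measured 251-byte
average leaves about 4x headroom within that 1kB budget.)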

Assuming that all content is spread evenly across the cluster, with
copies of each individual bundle on at least three different cluster
nodes, and that each cluster node additionally keeps a large cache of
the most frequently accessed content, a large repository with 100+M
content nodes could easily run on a twelve-node cluster where each
cluster node has 32GB of RAM, a reasonable size for a modern server
(also available on EC2 as m2.2xlarge). A mid-size repository with
10+M content nodes could run on a three- or four-node cluster with
just 16GB of RAM per cluster node (or m2.xlarge on EC2).
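Checking the numbers: 100M nodes x 1kB x 3 copies is roughly 300GB,
which fits comfortably in the 12 x 32GB = 384GB of total cluster
memory and still leaves some 84GB for caches and other overhead.
Likewise 10M x 1kB x 3 is about 30GB against 3 x 16GB = 48GB or
4 x 16GB = 64GB in total.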

I believe such a microkernel could set a pretty high bar on
performance! The only major performance limits I foresee are the
network overhead when writing (updates need to be sent to other
cluster nodes) and during cache misses (data needs to be retrieved
from other nodes). However, cache misses would only start affecting
repositories that grow beyond what fits in memory on a single server
(i.e. the mid-size repository described above wouldn't yet be hit by
that limit), and the write overhead could be amortized by allowing the
nodes to temporarily diverge until they have a chance to sync up again
in the background (as allowed by the MK contract).
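Something like the following rough sketch (again, all names made up
for illustration) is what I have in mind for amortizing writes: a
commit is applied to the local in-memory tree right away and only
queued for the peers, which are allowed to lag behind and catch up
asynchronously.

    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    /**
     * Hypothetical sketch of amortized write replication: writes return
     * as soon as they are applied locally, and a background thread
     * ships the journal entries to the peer nodes, which may
     * temporarily diverge before converging again.
     */
    class AsyncReplicator {

        private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
        private final List<Peer> peers;
        private final ExecutorService sender = Executors.newSingleThreadExecutor();

        AsyncReplicator(List<Peer> peers) {
            this.peers = peers;
            sender.execute(this::drain);
        }

        /** Called after a change has been applied to the local in-memory tree. */
        void committedLocally(String journalEntry) {
            pending.add(journalEntry); // return immediately; peers sync later
        }

        private void drain() {
            try {
                while (true) {
                    String entry = pending.take();
                    for (Peer peer : peers) {
                        peer.apply(entry); // peers converge in the background
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        /** Placeholder for another cluster node receiving the updates. */
        interface Peer {
            void apply(String journalEntry);
        }
    }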

BR,

Jukka Zitting
