Hi,

On Thu, Nov 22, 2012 at 8:15 PM, Alexander Klimetschek
<aklim...@adobe.com> wrote:
> Why does Jackrabbit/Oak not map JCR hierarchies directly to the filesystem?

As pointed out in the Kafka document, random access over a file system
is terribly inefficient, which is why splitting fine-grained content
like what you typically see in a content repository into separate
files and directories wouldn't perform well on normal file systems.
Doing so would also suffer from the other issues you mentioned, most
notably the lack of atomicity or locking.

Instead, and like Kafka also does, storing repository content in big
journal or collection files is a pretty good idea. That's what our
proprietary TarPM does for Jackrabbit 2.x, and you could argue that
the database-backed PMs and the Oak MKs do something similar through
the database engine. (Git also does this with its pack files.) The
main difference from the design outlined in the Kafka document is that
for various reasons (remote access, etc.) we've had to add several
levels of in-memory caching, especially in Jackrabbit 2.x. In Oak
we've tried to avoid extra caches, and so far have only had to add one
(which we could perhaps avoid with OAK-468). If we can keep that goal
up, and further optimize JSON processing at the MK level, it should be
possible for an Oak stack as well to work as outlined in the Kafka
document.
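
To illustrate the idea (just a rough sketch, not the actual TarPM or
MK code; all class and method names below are made up), the journal
approach boils down to appending records sequentially to one big file
and serving reads from an in-memory offset index instead of walking a
directory tree of tiny files:

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.util.HashMap;
    import java.util.Map;

    public class JournalStore {

        // single large data file, only ever appended to
        private final RandomAccessFile journal;

        // record id -> {offset, length} in the journal
        private final Map<String, long[]> index = new HashMap<String, long[]>();

        public JournalStore(String path) throws IOException {
            this.journal = new RandomAccessFile(path, "rw");
            this.journal.seek(this.journal.length()); // start appending at the end
        }

        /** Appends a record sequentially and remembers where it landed. */
        public synchronized void write(String id, byte[] data) throws IOException {
            long offset = journal.getFilePointer();
            journal.write(data);
            index.put(id, new long[] { offset, data.length });
        }

        /** Random read served through the offset index, not the file hierarchy. */
        public synchronized byte[] read(String id) throws IOException {
            long[] entry = index.get(id);
            if (entry == null) {
                return null;
            }
            byte[] data = new byte[(int) entry[1]];
            journal.seek(entry[0]);
            journal.readFully(data);
            journal.seek(journal.length()); // restore the append position
            return data;
        }
    }

The point is that all writes are sequential and a lookup costs one
seek into a file that's already open, instead of a path resolution
and open/close per node.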

BR,

Jukka Zitting
