* Marcin Nowak:
> Recently I've discovered an XML database quite similar in general
> concept to Jackrabbit; it does not provide versioning or
> referencing between nodes, but it is really fast compared with
> Jackrabbit, especially at querying and importing nodes. The
> question is: why does Jackrabbit perform so badly in comparison
> to eXist?
You're quite obviously trolling, so I won't comment on that, but
a few things are worth mentioning:
1. eXist is an XML database; Jackrabbit is not, so you are
comparing two unrelated things. Moreover, even if the query
syntax can look similar, eXist returns XML, whereas JCR returns
Java objects. You need to understand the implications of this:
parsing the resulting XML and working with it can quickly
lead to memory and CPU starvation, especially when the query
returns a lot of documents. JCR handles this nicely, as it
returns an iterator over the result set.
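To make the difference concrete, here is a self-contained sketch using only the JDK (no JCR or eXist API involved; all class and method names below are mine, purely for illustration). DOM parsing materializes the whole result in memory, which is roughly what handling a large XML response forces on you, while streaming touches one element at a time, which is the shape of a JCR NodeIterator:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StreamVsDom {
    // Hypothetical result set: n <doc/> elements, as an XML database might return.
    static String bigResult(int n) {
        StringBuilder sb = new StringBuilder("<results>");
        for (int i = 0; i < n; i++) sb.append("<doc id=\"").append(i).append("\"/>");
        return sb.append("</results>").toString();
    }

    // DOM: the entire result tree sits in memory before you touch the first item.
    static int countDom(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("doc").getLength();
    }

    // Streaming: one event at a time, constant memory -- the iterator style.
    static int countStreaming(String xml) throws Exception {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        int count = 0;
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT
                    && "doc".equals(r.getLocalName())) count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = bigResult(10_000);
        System.out.println("DOM count: " + countDom(xml));
        System.out.println("Streaming count: " + countStreaming(xml));
    }
}
```

Both count the same 10,000 elements, but only the DOM variant holds all of them in memory at once.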
2. Jackrabbit is mostly used as a Java API, whereas eXist is a
standalone beast with dedicated servlets that speak XML-RPC,
REST, and so on, mostly accessed via HTTP requests, which adds
overhead. eXist even has a front-end based on Cocoon. A *lot*
of caching is done on the eXist side, while with Jackrabbit you
will need a second-level cache in your own code to get the same
effect.
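The "second-level cache in your own code" can be as small as a memoizing wrapper around your repository lookups. A minimal sketch, with purely illustrative names (a real cache would also need invalidation when nodes change):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Memoize expensive lookups (e.g. a JCR session fetch) by key.
// Illustrative only: no eviction, no invalidation on writes.
public class SecondLevelCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // the expensive repository call
    private int misses = 0;

    public SecondLevelCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            misses++;               // only the first lookup hits the repository
            return loader.apply(k);
        });
    }

    public int misses() {
        return misses;
    }
}
```

Repeated reads of the same path then cost one repository round-trip instead of one per read.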
3. In my book, eXist is not designed to let you query the whole
database at once, whereas Jackrabbit, by design, can return a
sorted subset of documents from the whole repository very
efficiently. Accessing one XML document is very different from
querying a whole database of 10k+ documents. Play with eXist
for more than 5 minutes with a serious data set and you will
notice it yourself.
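For example, the kind of repository-wide, sorted query Jackrabbit is built for looks like this in JCR 1.0 XPath syntax (the node type and property here are just examples):

```
//element(*, nt:resource) order by @jcr:lastModified descending
```

One statement, evaluated against the whole workspace, with the sorting done by the repository rather than by your code.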
4. Jackrabbit's efficiency at importing nodes depends largely on
the persistence and filesystem implementations you are using.
For example, I've seen the BDB storage backend perform 10 times
faster than the XML-file-based one.
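The backend is chosen in the PersistenceManager element of repository.xml. A sketch of what swapping it looks like; note the class names below are from memory and vary between Jackrabbit versions, so check the ones shipped with your release:

```xml
<!-- XML-file-based storage (slow for bulk imports): -->
<PersistenceManager class="org.apache.jackrabbit.core.state.xml.XMLPersistenceManager"/>

<!-- Berkeley DB storage (much faster imports in my experience): -->
<PersistenceManager class="org.apache.jackrabbit.core.state.bdb.BerkeleyDBPersistenceManager"/>
```

Same content model, same API on top; only the storage layer changes.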
5. When you compare two approaches (one XML database, one JCR
repository) for your own use case, and especially when you ask
for feedback about your experiments, publish the results of
your benchmarks, and be very careful to mention *what* you
tested and *how*, including, of course, the actual numbers.
Otherwise you're just spreading FUD.
Cheers,
--
Jean-Baptiste Quenot
aka John Banana Qwerty
http://caraldi.com/jbq/