Thanks so much for the full explanation, Emmanuel. Sorry for asking again.

>> Otherwise, we could use LMDB, with a JNI wrapper. That is an option. But I 
>> have no idea what it would cost us in terms of packaging.
Packaging isn't really much of a problem. There is a nice way of handling 
this: the native binary can be put into the jar as a resource, and at startup 
it can be extracted into a runtime folder and loaded from there. Many Java 
projects do it this way.
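
To make the idea concrete, here is a minimal sketch of that extract-and-load 
step. The class name, the resource path and the file name are hypothetical 
and only serve the illustration; a real LMDB wrapper would also have to pick 
the right binary for the current OS and CPU architecture, which this sketch 
leaves out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of ApacheDS or of any LMDB wrapper.
final class NativeLibraryExtractor {
    private NativeLibraryExtractor() {
    }

    /** Copies the stream into a fresh temporary directory and returns the extracted file. */
    static Path extract(InputStream in, String fileName) throws IOException {
        Path dir = Files.createTempDirectory("native-libs");
        Path target = dir.resolve(fileName);
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        // Best-effort cleanup; files registered last are deleted first.
        dir.toFile().deleteOnExit();
        target.toFile().deleteOnExit();
        return target;
    }

    /** Extracts a classpath resource, e.g. a binary bundled inside the jar. */
    static Path extractResource(String resourcePath, String fileName) throws IOException {
        try (InputStream in = NativeLibraryExtractor.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IOException("Resource not found on classpath: " + resourcePath);
            }
            return extract(in, fileName);
        }
    }

    /** Extracts the bundled binary and loads it; System.load needs an absolute path. */
    static void load(String resourcePath, String fileName) throws IOException {
        Path library = extractResource(resourcePath, fileName);
        System.load(library.toAbsolutePath().toString());
    }
}
```

At startup this would be called once, before the first JNI call, with 
something like NativeLibraryExtractor.load("/native/liblmdb.so", "liblmdb.so") 
(again, an assumed path).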

If we could go this way, I would love to work on it and help in other areas 
as much as I can, because it sounds very promising as a way to deliver a 
high-performance, highly reliable LDAP server for the Apache world.

Regards,
Kai

-----Original Message-----
From: Emmanuel Lécharny [mailto:elecha...@gmail.com] 
Sent: Monday, June 27, 2016 6:03 PM
To: Apache Directory Developers List <dev@directory.apache.org>
Subject: Re: Rethinking Mavibot...

On 27/06/16 at 08:07, Zheng, Kai wrote:
> Thanks for the update.
>
> It looks to me like there is much work to do. Is there any alternative 
> option? I'm still wondering whether we could leverage an existing back-end 
> implementation, so we could focus on the LDAP-specific logic for the master 
> server component... This is worth considering because in today's industry 
> there are already so many B-tree implementations.
I think we already discussed that matter months ago. I also think that many 
don't understand why we *need* something like Mavibot. But let me try to 
explain again...


Back in 2006, we knew that we were going to have trouble with our choice 
(JDBM). Back then, we had little choice though:
- JDBM was the only open source, license-compatible B-tree implementation 
available in Java.
- We had other, more important issues to cope with.

However, during the Austin Apache Conference, at which CouchDB was 
announced, we had a long discussion with Alex, Pierre-Arnaud and Ersin about 
the fact that we would need an MVCC-based backend. Sadly, CouchDB was written 
in Erlang, so we had to wait.

We waited until 2011, when it became apparent that concurrent searches and 
updates would eventually generate errors (typically, some searches would 
fail). We added a hell of a lot of locks, up to the point where it was 
impossible to do a search while doing an update, which was a very expensive 
penalty to pay. At the same time, we started to look at alternatives that did 
not involve a rewrite. Some guy started to implement MVCC on top of JDBM, but 
the result was not pleasant: if for any reason you forgot to close a cursor, 
the server would go west in a matter of minutes. We can't force clients to 
carefully close their cursors, so it was simply not an option, and we ditched 
the work.

What alternatives did we have? Not many: Berkeley DB had been bought by 
Oracle, and the JE wasn't available with a compatible license. And as of 
today, there isn't any MVCC B-tree implementation that I know of with a 
compatible license. So we are in a kind of deadlock.

Funnily enough, at the very same time, OpenLDAP had started to work on the 
exact same piece of code, for the exact same reasons (BDB had changed its 
license, and some data corruption could occur under certain circumstances, 
requiring a tool to repair the database). So we knew we weren't in bad 
company!

Bottom line, I started to work on a replacement for JDBM, which got pushed to 
the repository in January 2012 (I had started working on it in mid-2011). 
Kiran ported ApacheDS to use Mavibot as a backend around spring 2013, and we 
now have an ApacheDS server that *works* with Mavibot. Not only that, but it's 
also faster than JDBM.

Is it enough? No. For one single reason: Mavibot with no transaction support 
won't be any better than JDBM, for the exact same reasons: if we have a 
crash, we will potentially end up with a corrupted database (less often than 
with JDBM, but still). It's way better though, because we can't have a 
failure during a search while updates are being done, and corruption can be 
fixed easily.

Mavibot brings some other extra bonuses: we can now inject data in bulk mode, 
which is orders of magnitude faster than adding data when the server is up and 
running.


Otherwise, we could use LMDB, with a JNI wrapper. That is an option. But I 
have no idea what it would cost us in terms of packaging. Right now, ApacheDS 
comes as a bundled package, or as an installer for Linux, Mac OS X and 
Windows. Having a dependency on a binary component might cause real trouble 
when it comes to packaging it properly. ATM, I'm not willing to spend time on 
this aspect.

Last but not least, Mavibot is *NOT* just a B-tree implementation. It's an 
MVCC (Multi-Version Concurrency Control) B-tree implementation 
(https://en.wikipedia.org/wiki/Multiversion_concurrency_control), which is 
*VERY* different. The critical aspect is the MVCC part: this is what 
guarantees consistency and lock-free access to the underlying database.
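
To illustrate what MVCC buys us, here is a minimal sketch. It is *not* 
Mavibot's API: the class and method names are made up, and a TreeMap stands in 
for the B-tree. It also copies the whole map on every write, whereas an MVCC 
B-tree would typically copy only the pages on the path from the modified leaf 
up to the root. The point is only that readers work on an immutable revision 
and never take a lock.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of MVCC; not Mavibot code.
final class MvccStoreSketch {

    /** An immutable revision: a revision number plus the data visible at that revision. */
    static final class Snapshot {
        final long revision;
        final NavigableMap<String, String> data;

        Snapshot(long revision, NavigableMap<String, String> data) {
            this.revision = revision;
            this.data = data;
        }

        String get(String key) {
            return data.get(key);
        }
    }

    private final AtomicReference<Snapshot> current = new AtomicReference<>(
            new Snapshot(0L, Collections.unmodifiableNavigableMap(new TreeMap<String, String>())));

    /** Readers grab the latest revision without locking and keep using it, whatever writers do. */
    Snapshot read() {
        return current.get();
    }

    /** Writers are serialized; each write publishes a new revision and leaves old ones untouched. */
    synchronized Snapshot put(String key, String value) {
        Snapshot old = current.get();
        TreeMap<String, String> copy = new TreeMap<>(old.data);
        copy.put(key, value);
        Snapshot next = new Snapshot(old.revision + 1,
                Collections.unmodifiableNavigableMap(copy));
        current.set(next);
        return next;
    }
}
```

A search that called read() before an update keeps seeing its own revision 
until it is done, so it cannot fail because of a concurrent write.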

What we are lacking ATM is cross-B-tree transaction support. This is what 
will bring two critical improvements to the ApacheDS server:

1) No need to implement a mechanism to restore a database if it crashes in the 
middle of an update (an LDAP update requires multiple updates to multiple 
indexes, typically 10 at a minimum, with some indexes, like the RDN index, 
being updated more than once).
2) Speed! During a transaction, we work in memory until we are done (with a 
commit or an abort). That saves us multiple updates on disk.
Typically, we would save 50% of the writes for a single Add operation.
That means an Add would be about twice as fast.
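
As a rough sketch of point 2 (all names are hypothetical, this is not the 
planned Mavibot design): if updates are buffered in memory per page, several 
updates to the same page inside one transaction collapse into a single write 
at commit time, and an abort costs no disk write at all.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of in-memory transaction buffering; not Mavibot code.
final class BufferedTransactionSketch {

    /** Stand-in for the disk: page id to content, plus a counter of physical writes. */
    static final class Disk {
        final Map<String, String> pages = new LinkedHashMap<>();
        int writes;

        void write(String pageId, String content) {
            pages.put(pageId, content);
            writes++;
        }
    }

    private final Disk disk;
    private final Map<String, String> dirty = new LinkedHashMap<>();

    BufferedTransactionSketch(Disk disk) {
        this.disk = disk;
    }

    /** Records the update in memory only; a later update to the same page replaces it. */
    void update(String pageId, String content) {
        dirty.put(pageId, content);
    }

    /** Flushes each dirty page exactly once. */
    void commit() {
        for (Map.Entry<String, String> page : dirty.entrySet()) {
            disk.write(page.getKey(), page.getValue());
        }
        dirty.clear();
    }

    /** Drops the buffered updates; the disk is left untouched. */
    void abort() {
        dirty.clear();
    }
}
```

With two updates to an "rdn" page and one to an "objectClass" page in the same 
transaction, commit() issues two writes instead of three. How much is saved in 
practice depends on how often the same pages are touched within one operation.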


I hope this clarifies the reasons why we started to develop Mavibot, even 
though it's not going as fast as it should (well, at some point, we have a 
life and a day job, neither of which lets us work as much as we would like on 
our favorite project).

I would end by telling everyone that this is an Open Source project, and 
anyone is very welcome if they want to lend a hand...

Thanks for 'listening'.
