RE: JdbmPartition repair
>> If we are to do that one day, we would rather use LMDB, which is way faster
>> than SQLite, proven, and small.

Agree. Looking at the benchmark results (http://symas.com/mdb/microbench/), LMDB looks pretty good, as does LevelDB. One question: is its license (the OpenLDAP Public License) compatible with ASF 2.0?

Regards,
Kai

-----Original Message-----
From: Emmanuel Lécharny [mailto:elecha...@gmail.com]
Sent: Monday, January 25, 2016 11:58 PM
To: Apache Directory Developers List <dev@directory.apache.org>
Subject: Re: JdbmPartition repair

On 25/01/16 15:27, Zheng, Kai wrote:
> Thanks a lot for the detailed and insightful explanation. I'm not able to
> absorb it all because I'm not familiar with the code. It will serve as very
> good material when someday I need to look into the LDAP side. The details
> make me believe it's really necessary to have a strong, mature,
> industry-proven backend for the LDAP server, because the LDAP logic is
> already complex enough. We can't mix the LDAP logic with the storage engine;
> they need to be separated, and developed and tested separately. It looks like
> Mavibot is going this way, which sounds good to me. What concerns me is that,
> as we lack the resources to develop it, it may still take some time to
> become mature and robust.

The Mavibot code base is small: 17,947 SLOC.

> But if we leverage some existing engine, then we can focus on the LDAP
> parts and work on some advanced features, move a little faster and
> have releases like 2.x, 3.x and so on. SQLite is C, yes, but it's
> supported on many platforms and Java can use it via JNI;

That would be a real pain. Linking some JNI lib and making it a package is really something we would like to avoid like the plague. If we are to do that one day, we would rather use LMDB, which is way faster than SQLite, proven, and small.

> it's a library and can be embedded in an application. You may dislike
> JNI, but only a few APIs would need to be wrapped, and there are
> already wonderful wrappers for Java. Like SnappyJava, the JNI layer
> along with the library can be bundled within a jar file and distributed
> exactly as a Maven module. One thing I'm not sure about is how well
> LDAP entries fit the SQL table model,

Bottom line: very badly. Actually, using a SQL backend to store LDAP entries is probably the worst possible solution, simply because LDAP supports multi-valued attributes, something SQL databases don't support natively.

> but I guess this direction could be investigated further. The benefits
> would be saving us a lot of development and debugging time, robustness
> and high performance, transaction support and easy querying. Just some
> thoughts in case they help. Thanks.

Thanks. We have been evaluating all those options for more than a decade now :-) OpenLDAP has gone down the exact same path, for the exact same reasons.
RE: JdbmPartition repair
Thanks a lot for the detailed and insightful explanation. I'm not able to absorb it all because I'm not familiar with the code. It will serve as very good material when someday I need to look into the LDAP side. The details make me believe it's really necessary to have a strong, mature, industry-proven backend for the LDAP server, because the LDAP logic is already complex enough. We can't mix the LDAP logic with the storage engine; they need to be separated, and developed and tested separately. It looks like Mavibot is going this way, which sounds good to me. What concerns me is that, as we lack the resources to develop it, it may still take some time to become mature and robust.

But if we leverage some existing engine, then we can focus on the LDAP parts and work on some advanced features, move a little faster and have releases like 2.x, 3.x and so on. SQLite is C, yes, but it's supported on many platforms and Java can use it via JNI; it's a library and can be embedded in an application. You may dislike JNI, but only a few APIs would need to be wrapped, and there are already wonderful wrappers for Java. Like SnappyJava, the JNI layer along with the library can be bundled within a jar file and distributed exactly as a Maven module. One thing I'm not sure about is how well LDAP entries fit the SQL table model, but I guess this direction could be investigated further. The benefits would be saving us a lot of development and debugging time, robustness and high performance, transaction support and easy querying. Just some thoughts in case they help. Thanks.

Regards,
Kai

-----Original Message-----
From: Emmanuel Lécharny [mailto:elecha...@gmail.com]
Sent: Monday, January 25, 2016 1:32 AM
To: Apache Directory Developers List <dev@directory.apache.org>
Subject: Re: JdbmPartition repair

On 24/01/16 16:47, Zheng, Kai wrote:
> Thanks Emmanuel for the sync and sharing. The approach looks pretty good, and
> is often seen in mature products. Is the repair process triggered when
> corruption is found while the server is running, or when restarting with a
> specific option? Or both? If the repair code is not easy to
> integrate, maybe a standalone repair tool like the one Kiran worked out would
> be good to use? Or the data and indexes could be checked at startup and, if
> something is wrong, the Java process could launch the tool to fix it? Just
> some thoughts, in case they are useful.

The corruption happens in some rare cases, and it's mostly due to concurrent updates. Let me explain what happens in detail, and sorry if it's a bit lengthy, it has to be.

We store entries in what we call the MasterTable. Entries are serialized, and each one of them has an associated ID (actually, we are using random UUIDs). So the master table contains tuples of <UUID, Entry>.

Each index refers to this MasterTable using the entry UUID. Typically, let's say an entry has the ObjectClass person; then the ObjectClass index will have a tuple <ObjectClass, Set> where the set contains the UUIDs of all the entries that have the 'person' ObjectClass.

We also have one special index, the Rdn index. This one is more complex, because it is used for two things: referring to an entry from an RDN, and keeping track of the hierarchy. If we have an entry whose DN is ou=users,dc=example,dc=com, where dc=example,dc=com is the partition's root, then the RDN index will contain two tuples for the full DN: one for the entry itself, and one for the suffix.

Actually, we don't store tuples like <Rdn, ID>, but a more complex structure, the ParentIdAndRdn. The reason is that we may have many entries using the same RDN. For instance:

entry 1 : cn=jSmith,ou=users,dc=example,dc=com
entry 2 : cn=jSmith,ou=administrators,dc=example,dc=com

Whether this jSmith is one person or two is irrelevant.
The thing is that we can't associate the RDN cn=jSmith with one single entry, so what we store is a tuple <entryId1,
Re: JdbmPartition repair
On 25/01/16 15:27, Zheng, Kai wrote:
> Thanks a lot for the detailed and insightful explanation. I'm not able to
> absorb it all because I'm not familiar with the code. It will serve as very
> good material when someday I need to look into the LDAP side. The details
> make me believe it's really necessary to have a strong, mature,
> industry-proven backend for the LDAP server, because the LDAP logic is
> already complex enough. We can't mix the LDAP logic with the storage engine;
> they need to be separated, and developed and tested separately. It looks like
> Mavibot is going this way, which sounds good to me. What concerns me is that,
> as we lack the resources to develop it, it may still take some time to
> become mature and robust.

The Mavibot code base is small: 17,947 SLOC.

> But if we leverage some existing engine, then we can focus on the LDAP parts
> and work on some advanced features, move a little faster and have releases
> like 2.x, 3.x and so on. SQLite is C, yes, but it's supported on many
> platforms and Java can use it via JNI;

That would be a real pain. Linking some JNI lib and making it a package is really something we would like to avoid like the plague. If we are to do that one day, we would rather use LMDB, which is way faster than SQLite, proven, and small.

> it's a library and can be embedded in an application. You may dislike JNI, but
> only a few APIs would need to be wrapped, and there are already wonderful
> wrappers for Java. Like SnappyJava, the JNI layer along with the library can
> be bundled within a jar file and distributed exactly as a Maven module. One
> thing I'm not sure about is how well LDAP entries fit the SQL table model,

Bottom line: very badly. Actually, using a SQL backend to store LDAP entries is probably the worst possible solution, simply because LDAP supports multi-valued attributes, something SQL databases don't support natively.

> but I guess this direction could be investigated further. The benefits
> would be saving us a lot of development and debugging time, robustness
> and high performance, transaction support and easy querying. Just some
> thoughts in case they help. Thanks.

Thanks. We have been evaluating all those options for more than a decade now :-) OpenLDAP has gone down the exact same path, for the exact same reasons.
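
To make the multi-valued point concrete, here is a minimal Java sketch. It is not ApacheDS code: the class name, the map-based entry and the row layout are all assumptions chosen for illustration. It shows that one entry with multi-valued attributes cannot be held in a single fixed-column row and instead flattens into one (entry, attribute, value) row per value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from ApacheDS.
class MultiValuedEntryDemo {

    /** A simplified LDAP entry: attribute name -> set of values. */
    static Map<String, Set<String>> sampleEntry() {
        Map<String, Set<String>> attrs = new LinkedHashMap<>();
        attrs.put("objectClass",
                new LinkedHashSet<>(List.of("top", "person", "organizationalPerson")));
        attrs.put("cn", new LinkedHashSet<>(List.of("jSmith", "John Smith")));
        attrs.put("sn", new LinkedHashSet<>(List.of("Smith")));
        return attrs;
    }

    /**
     * Flattens the entry into (entryId, attribute, value) rows, which is
     * roughly what a relational table needs: one row per value, so reading
     * an entry back means collecting and regrouping several rows.
     */
    static List<String[]> toRows(String entryId, Map<String, Set<String>> attrs) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, Set<String>> attribute : attrs.entrySet()) {
            for (String value : attribute.getValue()) {
                rows.add(new String[] { entryId, attribute.getKey(), value });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : toRows("entry-1", sampleEntry())) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

A key/value store, by contrast, can keep the whole serialized entry under a single key, which is the layout the MasterTable discussion in this thread describes.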
Re: JdbmPartition repair
Emmanuel, thanks for keeping us informed. I agree that corruption of data is a show stopper in terms of a product’s viability. Can we recreate this issue or is it intermittent? How can we help?

Shawn

> On Jan 24, 2016, at 7:39 AM, Emmanuel Lécharny wrote:
>
> Hi guys,
>
> we have many users complaining about a corrupted JDBM database. As of
> today, we don't have any solution other than telling them to reload their
> data, which is anything but comfortable. First because it might take ages
> (reloading data is very slow) and also because they might not have a backup.
>
> Although this is not a frequent scenario, when it happens, it really
> takes away any credibility ApacheDS can have.
>
> Here, we all know that Mavibot will be the solution, but until it's
> available with transaction support, we have to propose a tool that
> restores, if possible, the database.
>
> Fortunately, Kiran has worked on a tool that does that: the
> partition-plumber. The idea is to integrate this tool into ApacheDS in
> order to allow users to restore their database in a simple way. Here is
> what I propose:
>
> - first, a way to start ApacheDS in a repair mode. That will drop all
> the indexes, and recreate them based on the master table. It might take
> some time, but it will be way better than any other solution, and in any
> case it will be faster than a full reload, considering that we will bypass
> many checks. I suggest an option: apacheds -repair. When the server is
> started with that option, the server will restart after having cleaned
> up the database
> - the way to implement it is to add a method to the Partition interface:
> repair(). Not all the partitions will need it, so only JdbmPartition
> will actually implement it.
> - that method will simply delete (or copy, if we want a backup) all the
> existing indexes (system and user). We will then recreate the indexes
> based on the master table content. There is still a remote risk that the
> master table is corrupted, but it's unlikely, or at least very rare.
> Actually, the Rdn index is the one which gets corrupted most of the time,
> because it gets updated many times for each addition, move, rename or
> delete operation.
>
> I'm currently working on that, and it should be done fast enough (say,
> in less than a week, or even quicker if I have enough time this Sunday
> and at night).
>
> The next step, and I'm also working on that, is to finish Mavibot. The
> problem is that it's a complex piece of code, and it's hard to work on
> it when I just have a couple of hours in the evening or during the weekend.
> I'm sorry for that. But we will eventually get it ready!
>
> Thanks!
Re: JdbmPartition repair
On 24/01/16 16:47, Zheng, Kai wrote:
> Thanks Emmanuel for the sync and sharing. The approach looks pretty good, and
> is often seen in mature products. Is the repair process triggered when
> corruption is found while the server is running, or when restarting with a
> specific option? Or both? If the repair code is not easy to
> integrate, maybe a standalone repair tool like the one Kiran worked out would
> be good to use? Or the data and indexes could be checked at startup and, if
> something is wrong, the Java process could launch the tool to fix it? Just
> some thoughts, in case they are useful.

The corruption happens in some rare cases, and it's mostly due to concurrent updates. Let me explain what happens in detail, and sorry if it's a bit lengthy, it has to be.

We store entries in what we call the MasterTable. Entries are serialized, and each one of them has an associated ID (actually, we are using random UUIDs). So the master table contains tuples of <UUID, Entry>.

Each index refers to this MasterTable using the entry UUID. Typically, let's say an entry has the ObjectClass person; then the ObjectClass index will have a tuple <ObjectClass, Set> where the set contains the UUIDs of all the entries that have the 'person' ObjectClass.

We also have one special index, the Rdn index. This one is more complex, because it is used for two things: referring to an entry from an RDN, and keeping track of the hierarchy. If we have an entry whose DN is ou=users,dc=example,dc=com, where dc=example,dc=com is the partition's root, then the RDN index will contain two tuples for the full DN: one for the entry itself, and one for the suffix.

Actually, we don't store tuples like <Rdn, ID>, but a more complex structure, the ParentIdAndRdn. The reason is that we may have many entries using the same RDN. For instance:

entry 1 : cn=jSmith,ou=users,dc=example,dc=com
entry 2 : cn=jSmith,ou=administrators,dc=example,dc=com

Whether this jSmith is one person or two is irrelevant.
The thing is that we can't associate the RDN cn=jSmith with one single entry, so what we store is a tuple
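
The archived message is cut off at this point, so the exact tuple is not reproduced here. As a rough Java sketch of the idea described so far (an RDN alone is ambiguous, so the key also carries the parent), the code below maps a (parent id, RDN) pair to an entry id. That mapping is an assumption, and ParentAndRdnKey and RdnIndexSketch are hypothetical names, not the real ApacheDS ParentIdAndRdn class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.UUID;

/** Hypothetical composite key: the parent's id plus the entry's own RDN. */
final class ParentAndRdnKey {
    final UUID parentId; // null for the partition's root (the suffix)
    final String rdn;

    ParentAndRdnKey(UUID parentId, String rdn) {
        this.parentId = parentId;
        this.rdn = rdn;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ParentAndRdnKey)) {
            return false;
        }
        ParentAndRdnKey that = (ParentAndRdnKey) other;
        return Objects.equals(parentId, that.parentId) && rdn.equals(that.rdn);
    }

    @Override
    public int hashCode() {
        return Objects.hash(parentId, rdn);
    }
}

/** Hypothetical in-memory stand-in for an RDN index (assumed layout). */
class RdnIndexSketch {
    private final Map<ParentAndRdnKey, UUID> forward = new HashMap<>();

    /** Registers an entry under its parent and returns its new random id. */
    UUID add(UUID parentId, String rdn) {
        UUID id = UUID.randomUUID();
        forward.put(new ParentAndRdnKey(parentId, rdn), id);
        return id;
    }

    /** Walks the hierarchy from the suffix down; returns null if not found. */
    UUID resolve(String... rdnsFromRoot) {
        UUID current = null;
        for (String rdn : rdnsFromRoot) {
            current = forward.get(new ParentAndRdnKey(current, rdn));
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        RdnIndexSketch index = new RdnIndexSketch();
        UUID suffix = index.add(null, "dc=example,dc=com");
        UUID users = index.add(suffix, "ou=users");
        UUID admins = index.add(suffix, "ou=administrators");
        index.add(users, "cn=jSmith");
        index.add(admins, "cn=jSmith");
        // Same RDN, different parents, hence two distinct entries.
        System.out.println(index.resolve("dc=example,dc=com", "ou=users", "cn=jSmith"));
        System.out.println(index.resolve("dc=example,dc=com", "ou=administrators", "cn=jSmith"));
    }
}
```

With such a key, every add, move, rename or delete touches this index, which is consistent with the remark elsewhere in the thread that the Rdn index is the one most often corrupted.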
RE: JdbmPartition repair
Thanks Emmanuel for the sync and sharing. The approach looks pretty good, and is often seen in mature products. Is the repair process triggered when corruption is found while the server is running, or when restarting with a specific option? Or both? If the repair code is not easy to integrate, maybe a standalone repair tool like the one Kiran worked out would be good to use? Or the data and indexes could be checked at startup and, if something is wrong, the Java process could launch the tool to fix it? Just some thoughts, in case they are useful.

I'm not so sure about rewriting JDBM, though I know there are plenty of reasons to do so, as with most software rewrites. But if we have to start anew, implementing something like a B+ tree with transaction support, I wonder whether we could instead leverage an already industry-proven backend, because developing such a backend may take a long time and a lot of resources. I'm wondering whether SQLite could serve the purpose, and how it could be wrapped or adapted for use here. Again, just a quick thought in case it's useful.

Regards,
Kai

-----Original Message-----
From: Emmanuel Lécharny [mailto:elecha...@gmail.com]
Sent: Sunday, January 24, 2016 9:40 PM
To: Apache Directory Developers List <dev@directory.apache.org>
Subject: JdbmPartition repair

Hi guys,

we have many users complaining about a corrupted JDBM database. As of today, we don't have any solution other than telling them to reload their data, which is anything but comfortable. First because it might take ages (reloading data is very slow) and also because they might not have a backup.

Although this is not a frequent scenario, when it happens, it really takes away any credibility ApacheDS can have.

Here, we all know that Mavibot will be the solution, but until it's available with transaction support, we have to propose a tool that restores, if possible, the database.

Fortunately, Kiran has worked on a tool that does that: the partition-plumber. The idea is to integrate this tool into ApacheDS in order to allow users to restore their database in a simple way. Here is what I propose:

- first, a way to start ApacheDS in a repair mode. That will drop all the indexes, and recreate them based on the master table. It might take some time, but it will be way better than any other solution, and in any case it will be faster than a full reload, considering that we will bypass many checks. I suggest an option: apacheds -repair. When the server is started with that option, the server will restart after having cleaned up the database
- the way to implement it is to add a method to the Partition interface: repair(). Not all the partitions will need it, so only JdbmPartition will actually implement it.
- that method will simply delete (or copy, if we want a backup) all the existing indexes (system and user). We will then recreate the indexes based on the master table content. There is still a remote risk that the master table is corrupted, but it's unlikely, or at least very rare. Actually, the Rdn index is the one which gets corrupted most of the time, because it gets updated many times for each addition, move, rename or delete operation.

I'm currently working on that, and it should be done fast enough (say, in less than a week, or even quicker if I have enough time this Sunday and at night).

The next step, and I'm also working on that, is to finish Mavibot. The problem is that it's a complex piece of code, and it's hard to work on it when I just have a couple of hours in the evening or during the weekend. I'm sorry for that. But we will eventually get it ready!

Thanks!
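
The repair proposal (drop every index, then rebuild it from the master table) can be sketched in Java as follows. This is only an in-memory illustration under stated assumptions: RepairablePartition and InMemoryPartition are hypothetical stand-ins, not the real Partition or JdbmPartition types, and the map-based tables are invented for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical stand-in for a partition that offers a repair() method. */
interface RepairablePartition {
    void repair();
}

/** Hypothetical in-memory partition; the real storage is JDBM files. */
class InMemoryPartition implements RepairablePartition {
    // Master table: entry UUID -> (attribute name -> values).
    final Map<UUID, Map<String, Set<String>>> masterTable = new HashMap<>();
    // One index per indexed attribute: value -> UUIDs of entries holding it.
    final Map<String, Map<String, Set<UUID>>> indexes = new HashMap<>();

    InMemoryPartition(String... indexedAttributes) {
        for (String attribute : indexedAttributes) {
            indexes.put(attribute, new HashMap<>());
        }
    }

    void addEntry(UUID id, Map<String, Set<String>> attributes) {
        masterTable.put(id, attributes);
        indexEntry(id, attributes);
    }

    private void indexEntry(UUID id, Map<String, Set<String>> attributes) {
        for (Map.Entry<String, Map<String, Set<UUID>>> index : indexes.entrySet()) {
            Set<String> values = attributes.getOrDefault(index.getKey(), Set.of());
            for (String value : values) {
                index.getValue().computeIfAbsent(value, v -> new HashSet<>()).add(id);
            }
        }
    }

    @Override
    public void repair() {
        // Step 1: drop the content of every index; the master table is
        // treated as the only trusted source of data.
        for (Map<String, Set<UUID>> index : indexes.values()) {
            index.clear();
        }
        // Step 2: rescan the master table and re-index each entry.
        for (Map.Entry<UUID, Map<String, Set<String>>> entry : masterTable.entrySet()) {
            indexEntry(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        InMemoryPartition partition = new InMemoryPartition("objectClass");
        UUID id = UUID.randomUUID();
        partition.addEntry(id, Map.of("objectClass", Set.of("person")));
        partition.indexes.get("objectClass").clear(); // simulate a corrupted index
        partition.repair();
        System.out.println(partition.indexes.get("objectClass").get("person").contains(id));
    }
}
```

The sketch leaves out the optional backup copy of the old indexes and the server restart mentioned in the proposal; both would sit around the repair() call rather than inside it.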
Re: JdbmPartition repair
On 24/01/16 15:07, Shawn McKinney wrote:
> Emmanuel, thanks for keeping us informed. I agree that corruption of data is
> a show stopper in terms of a product’s viability. Can we recreate this issue
> or is it intermittent? How can we help?

It's hard to reproduce the issue, as it really depends on many random conditions...