Le 27/01/15 23:07, Harris, Christopher P a écrit :
> Hi, Emmanuel.
>
> "Can you tell us how you do that ? Ie, are you using a plain new connection
> for each thread you spawn ?"
> Sure. I can tell you how I am implementing a multi-threaded approach to read
> all of LDAP/AD into memory. I'll do the next best thing...paste my code at
> the end of my response.
>
>
> "In any case, the TimeOut is the default LDapConnection timeout (30 seconds)
> :"
> Yes, I noticed mention of the default timeout in your User Guide.
>
>
> "You have to set the LdapConnectionConfig timeout for all the created
> connections to use it. there is a setTimeout() method for that which has been
> added in 1.0.0-M28."
> When visiting your site while seeking to explore connection pool options, I
> noticed that you recently released M28 and fixed DIRAPI-217 and decided to
> update my pom.xml to M28 and test out the PoolableLdapConnectionFactory.
> Great job, btw. Keep up the good work!
>
> Oh, and your example needs to be updated to use
> DefaultPoolableLdapConnectionFactory instead of PoolableLdapConnectionFactory.
>
>
> "config.setTimeOut( whatever fits you );"
> Very good to know. Thank you!
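For reference, putting the two together — the config timeout plus the pooled
factory — looks roughly like this. It is only a sketch: the host, port, bind DN,
credentials and timeout value are all placeholders, not something from your setup.

```java
import org.apache.directory.ldap.client.api.DefaultPoolableLdapConnectionFactory;
import org.apache.directory.ldap.client.api.LdapConnection;
import org.apache.directory.ldap.client.api.LdapConnectionConfig;
import org.apache.directory.ldap.client.api.LdapConnectionPool;

public class PooledSetup
{
    public static void main( String[] args ) throws Exception
    {
        LdapConnectionConfig config = new LdapConnectionConfig();
        config.setLdapHost( "ldap.example.com" ); // placeholder host
        config.setLdapPort( 389 );
        config.setName( "cn=admin,dc=example,dc=com" ); // placeholder bind DN
        config.setCredentials( "secret" );
        config.setTimeout( 300000L ); // applies to every connection the pool creates (M28+)

        // The factory to use since M28
        DefaultPoolableLdapConnectionFactory factory = new DefaultPoolableLdapConnectionFactory( config );
        LdapConnectionPool pool = new LdapConnectionPool( factory );

        LdapConnection connection = pool.getConnection();

        try
        {
            // ... do your searches here ...
        }
        finally
        {
            pool.releaseConnection( connection );
        }
    }
}
```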
>
>
> "It is the right way."
> Sweeeeeeet!
>
>
> "Side note : you may face various problems when pulling everything from an AD
> server. Typically, the AD config might not let you pull more than
> 1000 entries, as there is a hard limit you need to change on AD if you want
> to get more entries.
>
> Otherwise, the approach - ie, using multiple threads - might seems good, but
> the benefit is limited. Pulling entries from the server is fast, you should
> be able to get tens of thousands per second with one single thread. I'm not
> sure how AD support concurrent searches anyway. Last, not least, it's likely
> that AD does not allow more than a certain number of concurrent threads to
> run, which might lead to contention at some point."
>
> Ah, this is why I wanted to reach out to you guys. You guys know this kind
> of in-depth information about LDAP and AD. So, I may adapt my code to a
> single-thread then. I can live with that. I need to pull about 40k-60k
> entries, so tens of thousands of entries per second works for me. I may need
> to run the code by you then if I go with a single-threaded approach and need
> to check if I'm going about it in the most efficient manner.
The problem with the multi-threaded approach is that you *have* to know which
entry has children, because the server won't give you that information. So you
will end up doing a search for every single entry you get at one level,
with scope ONE_LEVEL, and most of the time you will just get the entry
itself. That would more than double the time it takes to grab everything.
>
>
>
> And now time for some code...
>
> import java.io.IOException;
> import java.util.Iterator;
> import java.util.List;
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.TimeUnit;
> import java.util.logging.Level;
> import java.util.logging.Logger;
>
> import org.apache.commons.pool.impl.GenericObjectPool;
> import org.apache.directory.api.ldap.model.cursor.CursorException;
> import org.apache.directory.api.ldap.model.cursor.SearchCursor;
> import org.apache.directory.api.ldap.model.entry.Entry;
> import org.apache.directory.api.ldap.model.exception.LdapException;
> import org.apache.directory.api.ldap.model.message.Response;
> import org.apache.directory.api.ldap.model.message.SearchRequest;
> import org.apache.directory.api.ldap.model.message.SearchRequestImpl;
> import org.apache.directory.api.ldap.model.message.SearchResultEntry;
> import org.apache.directory.api.ldap.model.message.SearchScope;
> import org.apache.directory.api.ldap.model.name.Dn;
> import org.apache.directory.ldap.client.api.DefaultLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.LdapConnection;
> import org.apache.directory.ldap.client.api.LdapConnectionConfig;
> import org.apache.directory.ldap.client.api.LdapConnectionPool;
> import org.apache.directory.ldap.client.api.LdapNetworkConnection;
> import org.apache.directory.ldap.client.api.DefaultPoolableLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.ValidatingPoolableLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.SearchCursorImpl;
> import org.apache.directory.ldap.client.template.EntryMapper;
> import org.apache.directory.ldap.client.template.LdapConnectionTemplate;
>
> /**
> * @author Chris Harris
> *
> */
> public class LdapClient {
>
> public LdapClient() {
>
> }
>
> public Person searchLdapForCeo() {
> return this.searchLdapUsingHybridApproach(ceoQuery);
> }
>
> public Map<String, Person> buildLdapMap() {
>     SearchCursor cursor = new SearchCursorImpl(null, 300000, TimeUnit.SECONDS);
>     LdapConnection connection = new LdapNetworkConnection(host, port);
>     connection.setTimeOut(300000);
>     Entry entry = null;
>
>     try {
>         connection.bind(dn, pwd);
>
>         LdapClient.recursivelyGetLdapDirectReports(connection, cursor, entry, ceoQuery);
>         System.out.println("Finished all Ldap Map Builder threads...");
>     } catch (LdapException ex) {
>         Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>     } catch (CursorException ex) {
>         Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>     } finally {
>         cursor.close();
>         try {
>             connection.close();
>         } catch (IOException ex) {
>             Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>         }
>     }
>
>     return concurrentPersonMap;
> }
>
> private static Person recursivelyGetLdapDirectReports(LdapConnection connection,
>         SearchCursor cursor, Entry entry, String query) throws CursorException {
>     Person p = null;
>     EntryMapper<Person> em = Person.getEntryMapper();
>
>     try {
>         SearchRequest sr = new SearchRequestImpl();
>         sr.setBase(new Dn(searchBase));
>         StringBuilder sb = new StringBuilder(query);
>         sr.setFilter(sb.toString());
>         sr.setScope( SearchScope.SUBTREE );
Ahhhhh !!!! STOP !!!
Ok, no need to go any further in your code.
You are doing a SUBTREE search on *every single entry* you are pulling
from the base. Each of those searches makes the server walk the whole
subtree under that entry, so with 40 000 entries every entry gets scanned
and transferred once per ancestor instead of just once. No wonder you get
timeouts... Imagine you have such a tree :
root
  A1
    B1
      C1
      C2
    B2
      C3
      C4
  A2
    B3
      C5
      C6
    B4
      C7
      C8
The search on root will pull A1, A2, B1, B2, B3, B4, C1..C8 (14 entries
-> 14 searches).
Then the search on A1 will pull B1, C1, C2, B2, C3, C4 (6 entries -> 6
searches).
Then the search on A2 will pull B3, C5, C6, B4, C7, C8 (6 entries -> 6
searches).
Then the search on B1 will pull C1, C2 (2 entries -> 2 searches; the same
for B2, B3 and B4, so 4 × 2 = 8 searches).
...
At the end, you have done 1 + 14 + 12 + 8 = 35 searches, when you have
only 15 entries...
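You can check that count without any directory at all. Here is a quick
stand-alone sketch (plain Java, no LDAP involved) that models the 15-entry
example tree above, under the assumption that each per-entry SUBTREE search
returns that entry's descendants:

```java
import java.util.ArrayList;
import java.util.List;

class SearchCostSketch {
    // Minimal stand-in for a directory entry; only the tree shape matters here.
    static class Node {
        final List<Node> children = new ArrayList<>();
    }

    // Builds the 15-entry example tree: a root, 2 children (A1, A2),
    // 4 grandchildren (B1..B4) and 8 great-grandchildren (C1..C8).
    static Node buildExampleTree( int depth, int fanout ) {
        Node node = new Node();
        if ( depth > 0 ) {
            for ( int i = 0; i < fanout; i++ ) {
                node.children.add( buildExampleTree( depth - 1, fanout ) );
            }
        }
        return node;
    }

    // Number of entries in the subtree rooted at node, including node itself.
    static int size( Node node ) {
        int count = 1;
        for ( Node child : node.children ) {
            count += size( child );
        }
        return count;
    }

    // One SUBTREE search per entry, each returning that entry's descendants:
    // the total number of entries transferred is the sum of (subtree size - 1)
    // over all entries, instead of one transfer per entry.
    static int entriesTransferredPerEntrySubtree( Node node ) {
        int total = size( node ) - 1;
        for ( Node child : node.children ) {
            total += entriesTransferredPerEntrySubtree( child );
        }
        return total;
    }

    public static void main( String[] args ) {
        Node root = buildExampleTree( 3, 2 );
        // 15 entries, but 34 of them transferred (14 + 12 + 8), i.e. 35
        // searches counting the initial one on the root.
        System.out.println( size( root ) + " entries, "
            + entriesTransferredPerEntrySubtree( root ) + " transferred" );
    }
}
```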
If you want to see what your algorithm is doing, just do the same search
using SearchScope.ONE_LEVEL instead. You will then do roughly O(40 000)
searches, which is already far fewer than what you are doing now.
But anyway, doing a single search on the root with a SUBTREE scope will be
way faster still, because you will do only one search in total.
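Concretely, the single-search version looks like the sketch below. The host,
credentials, base DN and filter are placeholders, and you obviously need a
live server for it to do anything:

```java
import org.apache.directory.api.ldap.model.cursor.SearchCursor;
import org.apache.directory.api.ldap.model.entry.Entry;
import org.apache.directory.api.ldap.model.message.Response;
import org.apache.directory.api.ldap.model.message.SearchRequest;
import org.apache.directory.api.ldap.model.message.SearchRequestImpl;
import org.apache.directory.api.ldap.model.message.SearchResultEntry;
import org.apache.directory.api.ldap.model.message.SearchScope;
import org.apache.directory.api.ldap.model.name.Dn;
import org.apache.directory.ldap.client.api.LdapConnection;
import org.apache.directory.ldap.client.api.LdapNetworkConnection;

public class SingleSubtreeSearch
{
    public static void main( String[] args ) throws Exception
    {
        LdapConnection connection = new LdapNetworkConnection( "ldap.example.com", 389 );

        try
        {
            connection.setTimeOut( 300000L );
            connection.bind( "cn=admin,dc=example,dc=com", "secret" );

            SearchRequest sr = new SearchRequestImpl();
            sr.setBase( new Dn( "dc=example,dc=com" ) );
            sr.setFilter( "(objectClass=person)" );
            sr.setScope( SearchScope.SUBTREE ); // one search pulls the whole tree

            SearchCursor cursor = connection.search( sr );

            while ( cursor.next() )
            {
                Response response = cursor.get();

                if ( response instanceof SearchResultEntry )
                {
                    Entry entry = ( ( SearchResultEntry ) response ).getEntry();
                    // Build your Person map from each entry here
                }
            }

            cursor.close();
        }
        finally
        {
            connection.close();
        }
    }
}
```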