O.K.  I think I remember why I stopped looking at the SearchScope.  If I just 
change the scope to either ONELEVEL or OBJECT, the cursor comes back empty.

Is this an AD "feature" or is this a bug in the API?

It looks like I'll need to do 1 query using the SUBTREE scope.

 - Chris

-----Original Message-----
From: Harris, Christopher P 
Sent: Tuesday, January 27, 2015 6:16 PM
To: [email protected]
Subject: RE: Proper use of LdapConnectionPool

Ah, crap.  I forgot to look at the Scope.  I've been using this code for so 
long for single-search queries that I took it for granted.

I'll try setting it to ONE_LEVEL to simply bask in the glory of the speedy 
results, but still, doing just 1 query makes a hell of a lot more sense.

Sorry, I don't know where my head was.

Thanks for steering me down the right path.

 - Chris


-----Original Message-----
From: Emmanuel Lécharny [mailto:[email protected]]
Sent: Tuesday, January 27, 2015 5:43 PM
To: [email protected]
Subject: Re: Proper use of LdapConnectionPool

On 27/01/15 23:07, Harris, Christopher P wrote:
> Hi, Emmanuel.
>
> "Can you tell us how you do that ? Ie, are you using a plain new connection 
> for each thread you spawn ?"
> Sure.  I can tell you how I am implementing a multi-threaded approach to read 
> all of LDAP/AD into memory.  I'll do the next best thing...paste my code at 
> the end of my response.
>
>
> "In any case, the TimeOut is the default LDapConnection timeout (30 seconds) 
> :"
> Yes, I noticed mention of the default timeout in your User Guide.
>
>
> "You have to set the LdapConnectionConfig timeout for all the created 
> connections to use it. there is a setTimeout() method for that which has been 
> added in 1.0.0-M28."
> When visiting your site while seeking to explore connection pool options, I 
> noticed that you recently released M28 and fixed DIRAPI-217 and decided to 
> update my pom.xml to M28 and test out the PoolableLdapConnectionFactory.  
> Great job, btw.  Keep up the good work!
>
> Oh, and your example needs to be updated to use 
> DefaultPoolableLdapConnectionFactory instead of PoolableLdapConnectionFactory.
>
>
> "config.setTimeOut( whatever fits you );"
> Very good to know.  Thank you!
>
>
> "It is the right way."
> Sweeeeeeet!
>
>
> "Side note : you may face various problems when pulling everything 
> from an AD server. Typically, the AD config might not let you pull 
> more than 1000 entries, as there is a hard limit you need to change on AD if 
> you want to get more entries.
>
> Otherwise, the approach - ie, using multiple threads - might seem good, but 
> the benefit is limited. Pulling entries from the server is fast, you should 
> be able to get tens of thousands per second with one single thread. I'm not 
> sure how well AD supports concurrent searches anyway. Last but not least, it's 
> likely that AD does not allow more than a certain number of concurrent threads 
> to run, which might lead to contention at some point."
>
> Ah, this is why I wanted to reach out to you guys.  You guys know this kind 
> of in-depth information about LDAP and AD.  So, I may adapt my code to a 
> single-thread then.  I can live with that.  I need to pull about 40k-60k 
> entries, so 10's of thousands of entries per second works for me.  I may need 
> to run the code by you then if I go with a single-threaded approach and need 
> to check if I'm going about it in the most efficient manner.

The problem with the multi-threaded approach is that you *have* to know which 
entries have children, because the server won't give you that information. So 
you will end up doing a search for every single entry you get at one level, 
with scope ONE_LEVEL, and most of the time, you will just get the entry itself. 
That would more than double the time it takes to grab everything.

>
>
>
> And now time for some code...
>
> import java.io.IOException;
> import java.util.Iterator;
> import java.util.List;
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.TimeUnit;
> import java.util.logging.Level;
> import java.util.logging.Logger;
>
> import org.apache.commons.pool.impl.GenericObjectPool;
> import org.apache.directory.api.ldap.model.cursor.CursorException;
> import org.apache.directory.api.ldap.model.cursor.SearchCursor;
> import org.apache.directory.api.ldap.model.entry.Entry;
> import org.apache.directory.api.ldap.model.exception.LdapException;
> import org.apache.directory.api.ldap.model.message.Response;
> import org.apache.directory.api.ldap.model.message.SearchRequest;
> import org.apache.directory.api.ldap.model.message.SearchRequestImpl;
> import org.apache.directory.api.ldap.model.message.SearchResultEntry;
> import org.apache.directory.api.ldap.model.message.SearchScope;
> import org.apache.directory.api.ldap.model.name.Dn;
> import org.apache.directory.ldap.client.api.DefaultLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.LdapConnection;
> import org.apache.directory.ldap.client.api.LdapConnectionConfig;
> import org.apache.directory.ldap.client.api.LdapConnectionPool;
> import org.apache.directory.ldap.client.api.LdapNetworkConnection;
> import org.apache.directory.ldap.client.api.DefaultPoolableLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.ValidatingPoolableLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.SearchCursorImpl;
> import org.apache.directory.ldap.client.template.EntryMapper;
> import org.apache.directory.ldap.client.template.LdapConnectionTemplate;
>
> /**
>  * @author Chris Harris
>  *
>  */
> public class LdapClient {
>               
>       public LdapClient() {
>               
>       }
>                       
>       public Person searchLdapForCeo() {
>               return this.searchLdapUsingHybridApproach(ceoQuery);
>       }
>       
>       public Map<String, Person> buildLdapMap() {
>               SearchCursor cursor = new SearchCursorImpl(null, 300000, TimeUnit.SECONDS);
>               LdapConnection connection = new LdapNetworkConnection(host, port);
>               connection.setTimeOut(300000);
>               Entry entry = null;
>
>               try {
>                       connection.bind(dn, pwd);
>                       LdapClient.recursivelyGetLdapDirectReports(connection, cursor, entry, ceoQuery);
>                       System.out.println("Finished all Ldap Map Builder threads...");
>               } catch (LdapException ex) {
>                       Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>               } catch (CursorException ex) {
>                       Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>               } finally {
>                       cursor.close();
>                       try {
>                               connection.close();
>                       } catch (IOException ex) {
>                               Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>                       }
>               }
>
>               return concurrentPersonMap;
>       }
>       
>       private static Person recursivelyGetLdapDirectReports(LdapConnection connection,
>                       SearchCursor cursor, Entry entry, String query)
>                       throws CursorException {
>               Person p = null;
>               EntryMapper<Person> em = Person.getEntryMapper();
>
>               try {
>                       SearchRequest sr = new SearchRequestImpl();
>                       sr.setBase(new Dn(searchBase));
>                       StringBuilder sb = new StringBuilder(query);
>                       sr.setFilter(sb.toString());
>                       sr.setScope( SearchScope.SUBTREE );

Ahhhhh !!!! STOP !!!

Ok, no need to go any further in your code.

You are doing a SUBTREE search on *every single entry* you are pulling from the 
base. If you have 40 000 entries, you will do something like O(40 000!) 
(factorial) searches. No wonder you get timeouts... Imagine you have a tree 
like this:

root
  A1
    B1
      C1
      C2
    B2
      C3
      C4
  A2
    B3
      C5
      C6
    B4
      C7
      C8

The search on root will pull A1, A2, B1, B2, B3, B4, C1..8 (14 entries -> 14 searches)
Then the search on A1 will pull B1, C1, C2, B2, C3, C4 (6 entries -> 6 searches)
Then the search on A2 will pull B3, C5, C6, B4, C7, C8 (6 entries -> 6 searches)
Then the search on B1 will pull C1, C2 (2 entries -> 2 searches, *4 = 8) ...

At the end, you have done 1 + 14 + 12 + 8 = 35 searches, when you have only 15 
entries...

If you want to see what your algorithm is doing, just do a search using 
SearchScope.ONELEVEL instead. You will only do roughly O(40 000) searches, 
which is way less than what you are doing.

But anyway, doing a search on the root with a SUBTREE scope will be way faster, 
because you will do only one single search.
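
To make the tally above concrete, here is a minimal, self-contained Java sketch 
that models the toy tree in memory and counts how many entries come back for 
each strategy. It does not talk to an LDAP server or use the LDAP API; the 
class name SearchScopeCost and its methods are hypothetical, made up for this 
illustration. It assumes a search returns only the entries below the base, 
which is how the tally above counts them.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the toy tree; no LDAP server involved.
class SearchScopeCost {
    // parent -> direct children, as in the tree drawn above
    static final Map<String, List<String>> TREE = new LinkedHashMap<>();
    static {
        TREE.put("root", List.of("A1", "A2"));
        TREE.put("A1", List.of("B1", "B2"));
        TREE.put("A2", List.of("B3", "B4"));
        TREE.put("B1", List.of("C1", "C2"));
        TREE.put("B2", List.of("C3", "C4"));
        TREE.put("B3", List.of("C5", "C6"));
        TREE.put("B4", List.of("C7", "C8"));
    }

    static List<String> children(String entry) {
        return TREE.getOrDefault(entry, List.of());
    }

    // Total number of entries in the tree rooted at 'entry', base included.
    public static int countEntries(String entry) {
        int count = 1;
        for (String child : children(entry)) {
            count += countEntries(child);
        }
        return count;
    }

    // Entries returned by ONE simulated SUBTREE search below 'entry'
    // (assumption: the base entry itself is not counted).
    public static int subtreeResults(String entry) {
        return countEntries(entry) - 1;
    }

    // Entries returned in total when a SUBTREE search is issued on
    // 'entry' and again on every one of its descendants.
    public static int subtreeSearchOnEveryEntry(String entry) {
        int total = subtreeResults(entry);
        for (String child : children(entry)) {
            total += subtreeSearchOnEveryEntry(child);
        }
        return total;
    }

    // Entries returned in total when a one-level search is issued on
    // 'entry' and again on every one of its descendants.
    public static int oneLevelSearchOnEveryEntry(String entry) {
        int total = children(entry).size();
        for (String child : children(entry)) {
            total += oneLevelSearchOnEveryEntry(child);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("entries in tree:              " + countEntries("root"));
        System.out.println("single SUBTREE on root:       " + subtreeResults("root"));
        System.out.println("SUBTREE on every entry:       " + subtreeSearchOnEveryEntry("root"));
        System.out.println("one-level on every entry:     " + oneLevelSearchOnEveryEntry("root"));
    }
}
```

In this model, the single SUBTREE search on root returns the 14 entries once, 
while repeating a SUBTREE search on every entry returns 34 entries in total 
(14 + 12 + 8), which together with the initial search gives the 35 in the 
tally. One-level searches on every entry return 14 entries in total, but still 
cost one search per entry.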


The information transmitted is intended only for the person(s) or entity to 
which it is addressed and may contain confidential and/or legally privileged 
material. Delivery of this message to any person other than the intended 
recipient(s) is not intended in any way to waive privilege or confidentiality. 
Any review, retransmission, dissemination or other use of, or taking of any 
action in reliance upon, this information by entities other than the intended 
recipient is prohibited. If you receive this in error, please contact the 
sender and delete the material from any computer.

For Translation:

http://www.baxter.com/email_disclaimer
