I agree with Selcuk's analysis. I had not anticipated just how nasty the inconsistency handling would get.
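
For what it's worth, the paged-search idea Selcuk describes in the quoted text (read everything from the cursor when the paged search starts, then close the cursor and serve pages from the copy) might be sketched roughly as follows. This is only an illustration under assumptions: DrainedPagedSearch and ClosableCursor are hypothetical names, not ApacheDS classes, and the spill-to-temp-file part is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical sketch: drains a cursor once, closes it immediately,
 * and then serves pages from the buffered copy.
 */
final class DrainedPagedSearch<E> {

    /** Stand-in for a server-side cursor; NOT the ApacheDS Cursor interface. */
    interface ClosableCursor<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close();
    }

    private final List<E> buffered;
    private int position = 0;

    /** Reads every element from the cursor, then closes it even on failure. */
    DrainedPagedSearch(ClosableCursor<E> cursor) {
        List<E> all = new ArrayList<>();
        try {
            while (cursor.hasNext()) {
                all.add(cursor.next());
            }
        } finally {
            // The cursor (and its snapshot) is released here, so its lifetime
            // no longer depends on the client ever sending an unbind.
            cursor.close();
        }
        this.buffered = Collections.unmodifiableList(all);
    }

    /** Returns up to pageSize buffered elements and advances the position. */
    List<E> nextPage(int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int end = Math.min(position + pageSize, buffered.size());
        List<E> page = new ArrayList<>(buffered.subList(position, end));
        position = end;
        return page;
    }

    /** True once every buffered element has been handed out. */
    boolean isDone() {
        return position >= buffered.size();
    }
}
```

The trade-off is the one Selcuk mentions: no cursor stays open between pages, but the whole result set sits in memory (or in temp files, in a fuller version) until the client finishes or abandons the search.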

On Thu, May 10, 2012 at 8:18 PM, Selcuk AYA <[email protected]> wrote:

> On Thu, May 10, 2012 at 5:51 AM, Emmanuel Lécharny <[email protected]>
> wrote:
> > On 5/10/12 9:58 AM, Emmanuel Lécharny wrote:
> >
> >> On 5/10/12 7:57 AM, Selcuk AYA wrote:
> >>>
> >>> The problem seems to be caused by the test
> >>> testPagedSearchWrongCookie(). It tests failure of a paged search by
> >>> sending a bad cookie. After failing, it relies on ctx.close() to
> >>> clean up the session. Cleanup of the session will close all the
> >>> cursors related to paged searches through the session.
> >>>
> >>> It seems that somehow ctx.close() does not result in an unbind
> >>> message at the server side from time to time. I do not know what
> >>> causes this, but it leaves a cursor open (specifically a NoDups
> >>> cursor on the rdn index). Eventually, as changes happen to the Rdn
> >>> index, we run out of freeable cache headers. After ignoring this
> >>> test, pagedsearchit and searchit pass fine together. It would be
> >>> good to understand why arrival of the unbind message is hit and
> >>> miss in this test.
> >>
> >> It's absolutely strange... Neither an UnbindRequest nor an
> >> AbandonRequest is sent by JNDI when closing the context, which is a
> >> huge bug.
> >>
> >> I have checked the other tests, and an Unbind request is always sent
> >> when we close the context, except when we get an UnwillingToPerform
> >> exception. It seems like the context is in a state where it considers
> >> that no unbind should be sent after an exception. Although I can do a
> >> lookup (and get back the correct response from the server after this
> >> exception), the connection is still borked :/
> >>
> >> I'll try to rewrite the test using our API to see if it works better,
> >> and investigate with some Sun guys to see if there is an issue in
> >> JNDI.
> >
> > Ok, we have had a long discussion with Alex about this problem...
> >
> > The thing is that even for a standard PagedSearch, where everything
> > goes fine (i.e., when the client is done, it has correctly closed the
> > connection, which sends an UnbindRequest, which closes the cursor,
> > etc.), we may have dozens of open cursors for some extended period of
> > time.
> >
> > At some point, we may have an exhausted cache, with no way to evict
> > any elements from it, leading to a server freeze.
> >
> > Not something we can accept from an LDAP server...
> >
> > A suggestion would be to add some parameter in the OperationContext
> > telling the underlying layer that a search is done outside of any
> > transaction. When we fetch an ID from an index and try to get the
> > associated Entry from the master table, if we get an error because
> > the ID does not exist anymore, then we should just ignore the error
> > and continue the search.
> >
> > But we still want to be sure that in some cases, inside the server,
> > we can still have transactions over some searches.
> >
> > Thoughts ?
>
> I don't think having non-transactional search is a good idea. I agree
> there is a problem with non-closed cursors, but I don't think this is
> the right way to solve it. We currently do not have transactions for
> the search, but a cursor over the jdbm B tree gets a snapshot view.
> This snapshot view is not only a snapshot of the data but also of the
> structure itself. If you do not have this (and on top of this, if you
> don't have txns):
>
> - you will have to deal with inconsistencies in the Btree data
>   structure
> - you might get data as NULL from the Btree and you might have to
>   deal with it. Or you might have to deal with cases where you counted
>   10 children but actually end up with 9 children while doing a DFS
>   search over your data structure. This might look easy but I think it
>   is not.
> - you might get not only stale data but complete garbage. This
>   garbage might confuse the code completely (for example, if the
>   garbage you read was supposed to be a Btree redirect).
>
> Code from the ldap protocol handlers down to search is written
> assuming cursors get consistent data. I don't think it is impossible
> to write code expecting all kinds of inconsistencies, but it is very
> difficult and the code will be brittle.
>
> As for the paged search, one way to deal with it would be to read all
> the data from the cursors at the beginning of the paged search and
> close the cursor. This would be similar to a normal search. If we get
> worried about the memory consumption of this, the entries to be
> returned could be spilled over to temp files. You might say this might
> lead to temp files that are never reclaimed, but if there are not many
> of them then it is no big deal. Users are supposed to deal with
> cleaning up their contexts. Not doing so is similar to opening file
> handles or socket connections and never closing them. Such things are
> bound to create problems.
>
> > --
> > Regards,
> > Cordialement,
> > Emmanuel Lécharny
> > www.iktek.com
>
> thanks
> Selcuk

--
Best Regards,
-- Alex
