Re: Release troubles and failing tests

Alex Karasulu Thu, 10 May 2012 10:25:27 -0700

I agree with Selcuk's analysis. I had not anticipated just how nasty
the inconsistency handling would get.


On Thu, May 10, 2012 at 8:18 PM, Selcuk AYA <[email protected]> wrote:

> On Thu, May 10, 2012 at 5:51 AM, Emmanuel Lécharny <[email protected]>
> wrote:
> > Le 5/10/12 9:58 AM, Emmanuel Lécharny a écrit :
> >
> >> Le 5/10/12 7:57 AM, Selcuk AYA a écrit :
> >>>
> >>> The problem seems to be caused by the test
> >>> testPagedSearchWrongCookie(). It tests failure in paged search by
> >>> sending a bad cookie. After failing, it relies on ctx.close() to
> >>> clean up the session. Cleaning up the session closes all the cursors
> >>> related to paged searches in that session.
> >>>
> >>> It seems that, from time to time, ctx.close() somehow does not result
> >>> in an unbind message on the server side. I do not know what causes
> >>> this, but it leaves a cursor open (specifically a NoDups cursor on the
> >>> Rdn index). Eventually, as changes happen to the Rdn index, we run out
> >>> of freeable cache headers. After ignoring this test, pagedsearchit and
> >>> searchit pass fine together. It would be good to understand why the
> >>> arrival of the unbind message is hit-and-miss in this test.
> >>
> >>
> >> It's absolutely strange... Neither an UnbindRequest nor an AbandonRequest
> >> is sent by JNDI when closing the context, which is a huge bug.
> >>
> >> I have checked the other tests, and an Unbind request is always sent when
> >> we close the context, except when we get an UnwillingToPerform exception.
> >> It seems the context is in a state where it considers that no unbind
> >> should be sent after an exception. Although I can do a lookup (and get
> >> back the correct response from the server after this exception), the
> >> connection is still borked :/
> >>
> >> I'll try to rewrite the test using our API to see if it works better, and
> >> investigate with some Sun guys to see if there is an issue in JNDI.
> >>
> >>
> >>
> > Ok, we have had a long discussion with Alex about this problem...
> >
> > The thing is that even for a standard PagedSearch, where everything goes
> > fine (i.e., when the client is done, it correctly closes the connection,
> > which sends an UnbindRequest, which closes the cursor, etc.), we may have
> > dozens of open cursors for some extended period of time.
> >
> > At some point, we may have an exhausted cache, with no way to evict any
> > elements from it, leading to a server freeze.
> >
> > Not something we can accept from an LDAP server...
> >
> > A suggestion would be to add some parameter in the OperationContext
> > telling the underlying layer that a search is done outside of any
> > transaction. When we fetch an ID from an index and try to get the
> > associated Entry from the master table, if we get an error because the
> > ID does not exist anymore, then we should just ignore the error and
> > continue the search.
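
A minimal sketch of the suggestion above, in Java, to make the intended behaviour concrete. Everything in it is an assumption made for illustration: the class name, the `transactional` flag, and the plain maps and lists standing in for the index and the master table are hypothetical and are not ApacheDS code. It covers only the "ID no longer exists" case, not structural inconsistencies in the underlying B tree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from ApacheDS.
final class LenientSearchSketch {

    /**
     * Resolves candidate IDs (as read from an index) against a master table.
     * When transactional is false, an ID that no longer exists in the master
     * table is skipped; when true, it is reported as an error.
     */
    static List<String> search(List<Long> candidateIds,
                               Map<Long, String> masterTable,
                               boolean transactional) {
        List<String> entries = new ArrayList<>();
        for (Long id : candidateIds) {
            String entry = masterTable.get(id);
            if (entry == null) {
                if (transactional) {
                    throw new IllegalStateException("No entry for ID " + id);
                }
                continue; // entry deleted since the index was read: ignore it
            }
            entries.add(entry);
        }
        return entries;
    }

    public static void main(String[] args) {
        Map<Long, String> master = new LinkedHashMap<>();
        master.put(1L, "ou=system");
        master.put(3L, "uid=admin,ou=system");
        // ID 2 was in the index when it was read, but has since been deleted.
        List<Long> idsFromIndex = Arrays.asList(1L, 2L, 3L);
        System.out.println(search(idsFromIndex, master, false));
    }
}
```

With the flag set to false the stale ID is silently dropped and the search continues; with it set to true the same lookup fails, which is the behaviour a transactional search inside the server would presumably want to keep.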
> >
> > But we still want to be sure that in some cases, inside the server, we
> > can still have transactions over some searches.
> >
> > Thoughts ?
> >
>
> I don't think having non-transactional search is a good idea. I agree
> there is a problem with unclosed cursors, but I don't think this is
> the right way to solve it. We currently do not have transactions for
> search, but a cursor over the jdbm B tree gets a snapshot view.
> This snapshot covers not only the data but also the structure itself.
> If you do not have this (and, on top of this, if you don't have txns):
>
>  - you will have to deal with inconsistencies in the Btree data structure
>  - you might get data as NULL from the Btree and have to deal with it.
> Or you might have to deal with cases where you counted 10 children but
> actually end up with 9 while doing a DFS over your data structure. This
> might look easy, but I think it is not.
>  - you might get not only stale data but complete garbage. This
> garbage might confuse the code completely (for example, if the garbage
> you read was supposed to be a Btree redirect).
>
> Code from the LDAP protocol handlers down to search is written
> assuming cursors get consistent data. I don't think it is impossible to
> write code expecting all kinds of inconsistencies, but it is very
> difficult and the code will be brittle.
>
>
> As for the paged search, one way to deal with it would be to read all
> the data from the cursors at the beginning of the paged search and
> close the cursor. This would be similar to a normal search. If we get
> worried about the memory consumption of this, the entries to be returned
> could be spilled over to temp files. You might say this could lead to
> temp files that are never reclaimed, but if there are not many of them
> then it is no big deal. Users are supposed to clean up their
> contexts. Not doing so is similar to opening file handles or socket
> connections and never closing them. Such things are bound to create
> problems.
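
The "read everything up front, then close the cursor" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `MaterializedPagedSearch`, its constructor arguments, and `nextPage` are hypothetical names, a plain `Iterator` stands in for the server's cursor, and the spill-to-temp-file step is left out (results stay in memory).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: the class and method names are illustrative only.
final class MaterializedPagedSearch {
    private final List<String> results; // copy taken when the search starts
    private int position = 0;

    /** Drains the cursor completely, then closes it before any page is served. */
    MaterializedPagedSearch(Iterator<String> cursor, AutoCloseable cursorHandle)
            throws Exception {
        List<String> drained = new ArrayList<>();
        try {
            while (cursor.hasNext()) {
                drained.add(cursor.next());
            }
        } finally {
            cursorHandle.close(); // no cursor stays open between pages
        }
        this.results = drained;
    }

    /** Returns the next page; an empty list means the search is finished. */
    List<String> nextPage(int pageSize) {
        int end = Math.min(position + pageSize, results.size());
        List<String> page = new ArrayList<>(results.subList(position, end));
        position = end;
        return page;
    }

    public static void main(String[] args) throws Exception {
        List<String> entries = Arrays.asList("e1", "e2", "e3", "e4", "e5");
        MaterializedPagedSearch search = new MaterializedPagedSearch(
                entries.iterator(), () -> System.out.println("cursor closed"));
        List<String> page;
        while (!(page = search.nextPage(2)).isEmpty()) {
            System.out.println(page);
        }
    }
}
```

The point of the design is that the cursor's lifetime no longer depends on the client: it is closed before the first page is returned, so a client that never sends an unbind leaves behind only the buffered results, not an open cursor pinning cache entries.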
>
>
> >
> >
> > --
> > Regards,
> > Cordialement,
> > Emmanuel Lécharny
> > www.iktek.com
> >
>
> thanks
> Selcuk
>



-- 
Best Regards,
-- Alex
