[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15809817#comment-15809817
 ] 

Pierre De Rop commented on FELIX-5471:
--------------------------------------

Hello Jeroen,

OK, so last week I tried to introduce some locking in the component state 
machine in order to block unregistering threads until the listeners have been 
called (for example, blocking the thread that is unregistering X until 
M.unbind(X) has been called).
To do so, I introduced a "Future" object in the Component.schedule() method in 
order to block the thread that calls the handleEvent() method with a REMOVED 
event (as I did in the previous patch, which I reverted).

However, I ran into deadlocks, especially when there are cycles between 
components.
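
To illustrate why blocking on a Future can deadlock when components form a 
cycle, here is a minimal sketch. It is not DM code: it assumes each component 
owns a serial queue (modelled here by a single-thread executor), and the names 
CyclicBlockingSketch, queueA and queueB are hypothetical. A timeout is used 
only so that the demo terminates instead of hanging.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration, not DM code: each "component" owns a serial queue.
class CyclicBlockingSketch {

    /** Returns true if A's blocking wait on B timed out because of the cycle. */
    static boolean blockingWaitTimesOut() {
        ExecutorService queueA = Executors.newSingleThreadExecutor();
        ExecutorService queueB = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> result = queueA.submit(() -> {
                // Running on A's queue: schedule work on B and block until done.
                Future<?> bDone = queueB.submit(() -> {
                    // B's handler needs A to process a task (the cycle), but
                    // A's only queue thread is busy waiting for B.
                    Future<?> aDone = queueA.submit(() -> { });
                    aDone.get();
                    return null;
                });
                try {
                    bDone.get(200, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    // Without the timeout, A and B would wait on each other forever.
                    return true;
                }
            });
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Already queued tasks still run, so B's pending wait can finish.
            queueA.shutdown();
            queueB.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("blocked on cycle: " + blockingWaitTimesOut());
    }
}
```

In this sketch, A's queue thread cannot run the task B is waiting for because 
it is itself blocked on B. That is the general shape of the problem with 
blocking waits across per-component queues.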

Then I tried to invoke the Component.unregisterService() method outside of the 
Component queue, but for that I had to create a new UNREGISTERING state and 
also implement a mechanism for scheduling a task outside the queue.
Even after doing that, I still ran into concurrency issues, and some of the 
concurrent tests no longer passed.

I then gave up on that approach and finally had the opportunity to have a long 
discussion with Marcel. We came to the conclusion that implementing 
synchronous listener notification (that is, M.unbind(X)) for service 
unregistration (when S is unregistering) in all concurrent scenarios would 
significantly weaken the locking model currently used in DM. It might be 
possible, but only at a very high cost and with a major refactoring, and it is 
not reasonable to do that for the moment.

That being said, there are some other aspects to consider: if the X service 
is already stopped at the time M.unbind(X) is called, then the X service is 
said to be "stale" in the OSGi spec, and in this case the behavior of a 
"stale" service object that becomes unregistered is undefined.
Such service objects may continue to work properly or throw an exception at 
their discretion (see 5.7 "Stale Reference" in core spec), or they could 
simply ignore further method calls silently after they are stopped.

So, the framework provides a runtime exception, 
org.osgi.framework.ServiceException, which you could throw from the X 
component if M.unbind(X) calls X methods after X has been stopped.
That exception also lets you specify a reason why the call failed, and the 
M.unbind method could then just log it.
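
As a rough sketch of that idea (the class names XService, MComponent and 
StaleServiceSketch are hypothetical): X rejects calls once it is stopped, and 
M.unbind catches and logs the exception. So that the example compiles without 
the OSGi jars, it declares a small local stand-in for 
org.osgi.framework.ServiceException. In a real bundle you would import the 
framework class instead, which provides the same (String, int) constructor, 
getType() and the UNREGISTERED type constant.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Local stand-in so the sketch compiles without OSGi on the classpath.
// In a bundle, use org.osgi.framework.ServiceException instead.
class ServiceException extends RuntimeException {
    static final int UNREGISTERED = 1;
    private final int type;

    ServiceException(String msg, int type) {
        super(msg);
        this.type = type;
    }

    int getType() {
        return type;
    }
}

// Hypothetical service X that rejects calls once it has been stopped.
class XService {
    private final AtomicBoolean stopped = new AtomicBoolean();

    void stop() {
        stopped.set(true);
    }

    String status() {
        if (stopped.get()) {
            throw new ServiceException("X is stopped", ServiceException.UNREGISTERED);
        }
        return "running";
    }
}

// Hypothetical dependent component M; a list stands in for a real logger.
class MComponent {
    final List<String> log = new ArrayList<>();

    void unbind(XService x) {
        try {
            log.add("unbinding X, status=" + x.status());
        } catch (ServiceException e) {
            // X was already stale: record the reason and carry on.
            log.add("stale X: " + e.getMessage() + " (type " + e.getType() + ")");
        }
    }
}

class StaleServiceSketch {
    public static void main(String[] args) {
        XService x = new XService();
        MComponent m = new MComponent();
        x.stop();     // X is stopped before the unbind callback is delivered
        m.unbind(x);  // M touches the stale X and logs the failure
        System.out.println(m.log);
    }
}
```

This way a late M.unbind(X) call fails in a predictable manner instead of 
depending on whatever X happens to do after it has been stopped.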

So, all in all, a massive refactoring is not worth doing for the moment; it 
would be too dangerous. If you want to avoid the situation where stale 
components can temporarily be called while being unbound from other services, 
then maybe the option for the moment is to not use the concurrent mode of DM 
and to use the single-threaded mode, as before (sorry about that).



> Ensure that unbound services are always handled synchronously
> -------------------------------------------------------------
>
>                 Key: FELIX-5471
>                 URL: https://issues.apache.org/jira/browse/FELIX-5471
>             Project: Felix
>          Issue Type: Bug
>          Components: Dependency Manager
>    Affects Versions: org.apache.felix.dependencymanager-r1
>            Reporter: Pierre De Rop
>            Assignee: Pierre De Rop
>             Fix For: org.apache.felix.dependencymanager-r9
>
>
> When a component loses a service dependency, it should handle the lost 
> service synchronously. For example, if service A loses a dependency on B 
> (because B is being unregistered),  then A.remove(B) should be called 
> synchronously (when B is being unregistered from the service registry), else 
> the A.remove(B) callback could possibly be invoked while B is already 
> unregistered and stopped.
> Currently, unbound services may be handled asynchronously if DM is used in a 
> concurrent mode (using a threadpool). And even if no threadpool is used, the 
> issue may happen if there is a highly concurrent situation where services are 
> registered/removed concurrently from multiple threads.
> So, a patch is needed to ensure that a service dependency remove event is 
> always handled synchronously (especially if DM is used with a 
> threadpool).
> I will provide a testcase soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
