[jira] [Commented] (FELIX-5471) Ensure that unbound services are always handled synchronously

2017-01-10 Thread Jeroen Daanen (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814707#comment-15814707
 ] 

Jeroen Daanen commented on FELIX-5471:
--

Ok, I understand that. 
So we now know that when the dependency manager runs in parallel mode, a 
remove callback may, in highly concurrent situations, receive a service that 
is already 'stale'/stopped, and we just have to account for that.
Thanks a lot for investigating and discussing this, Pierre!

> Ensure that unbound services are always handled synchronously
> -
>
> Key: FELIX-5471
> URL: https://issues.apache.org/jira/browse/FELIX-5471
> Project: Felix
> Issue Type: Bug
> Components: Dependency Manager
> Affects Versions: org.apache.felix.dependencymanager-r1
> Reporter: Pierre De Rop
> Assignee: Pierre De Rop
> Fix For: org.apache.felix.dependencymanager-r9
>
>
> When a component loses a service dependency, it should handle the lost 
> service synchronously. For example, if service A loses a dependency on B 
> (because B is being unregistered),  then A.remove(B) should be called 
> synchronously (when B is being unregistered from the service registry), else 
> the A.remove(B) callback could possibly be invoked while B is already 
> unregistered and stopped.
> Currently, unbound services may be handled asynchronously if DM is used in a 
> concurrent mode (using a threadpool). And even if no threadpool is used, the 
> issue may happen if there is a highly concurrent situation where services are 
> registered/removed concurrently from multiple threads.
> So, a patch is needed to ensure that a service dependency remove event is 
> always handled synchronously (especially if DM is used with a threadpool).
> I will provide a testcase soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FELIX-5471) Ensure that unbound services are always handled synchronously

2017-01-08 Thread Pierre De Rop (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15809817#comment-15809817
 ] 

Pierre De Rop commented on FELIX-5471:
--

Hello Jeroen,

Ok; so last week, I tried to introduce some locking in the component state 
machine in order to block unregistering threads until listeners are called 
(for example, blocking the thread which is unregistering X until M.unbind(X) 
is called).
To do so, I tried to introduce a "Future" object in the Component.schedule() 
method in order to block the thread which calls the handleEvent() method with 
a REMOVED event (as I did in the previous patch, which I reverted).

However, I came across some deadlocks, especially when there are cycles 
between components.

Then I tried to invoke the Component.unregisterService() method outside of the 
Component queue, but I then had to create a new UNREGISTERING state and also 
had to implement a mechanism for scheduling a task outside the queue.
But even after having done that, I still came across some remaining 
concurrency issues, and some concurrent tests were not working anymore.

I then gave up on that approach and finally had the opportunity to have a long 
discussion with Marcel. We came to the conclusion that implementing 
synchronous listener notification (that is: M.unbind(X)) for service 
unregistration (when the service is unregistering) in all concurrent scenarios 
would significantly weaken the locking model currently used in DM (it might be 
possible, but at a very high cost and with a major refactoring, and it's not 
reasonable to do that for the moment).

That being said, there are some other aspects to consider: if the X service 
is already stopped at the time M.unbind(X) is called, then the X service is 
said to be "stale" in the OSGi spec, and the behavior of a "stale" service 
object that has been unregistered is undefined.
Such service objects may continue to work properly or throw an exception at 
their discretion (see 5.7 "Stale Reference" in core spec), or the objects could 
simply silently ignore further method calls after they are stopped.

So, the framework provides a runtime exception, 
org.osgi.framework.ServiceException, which you could throw from the X 
component in case M.unbind(X) calls some X methods after X has been stopped.
That exception also allows you to specify a reason why the call failed, and 
the M.unbind method could then just log it.
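
To make the idea above concrete, here is a minimal sketch of a service that rejects calls once it is stopped, and of an unbind callback that just logs the failure. All class names are hypothetical. StaleServiceException is only a stand-in so that the sketch compiles without the OSGi jars; in a real bundle X would throw org.osgi.framework.ServiceException instead.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in (assumption) for org.osgi.framework.ServiceException.
class StaleServiceException extends RuntimeException {
    StaleServiceException(String reason) {
        super(reason);
    }
}

// Hypothetical service X that refuses calls once it has been stopped.
class ServiceX {
    private volatile boolean stopped;

    void stop() {
        stopped = true;
    }

    String describe() {
        if (stopped) {
            // Tell the caller why the call failed instead of misbehaving silently.
            throw new StaleServiceException("X has already been stopped");
        }
        return "X is running";
    }
}

// Hypothetical dependent component M.
class ComponentM {
    final List<String> log = new ArrayList<>();

    // Unbind callback: X may already be stale here, so only log the failure.
    void unbind(ServiceX x) {
        try {
            log.add("unbound: " + x.describe());
        } catch (StaleServiceException e) {
            log.add("unbound stale service: " + e.getMessage());
        }
    }
}

class StaleUnbindDemo {
    public static void main(String[] args) {
        ServiceX x = new ServiceX();
        ComponentM m = new ComponentM();
        x.stop();    // X is stopped before M gets its unbind callback
        m.unbind(x); // M survives the stale call and records the reason
        System.out.println(m.log);
    }
}
```

With this pattern M.unbind(X) stays safe whichever way the race goes: if X is still running the call succeeds, and if X is already stopped the reason ends up in the log.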

So, all in all, it's not worth doing a massive refactoring for the moment; it 
would be too dangerous. If you would like to avoid the situation where stale 
components can temporarily be called while being unbound from other services, 
then maybe the option for the moment is to not use the concurrent mode of DM 
and to use the single thread mode, as before (sorry about that).





[jira] [Commented] (FELIX-5471) Ensure that unbound services are always handled synchronously

2017-01-05 Thread Jeroen Daanen (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15801235#comment-15801235
 ] 

Jeroen Daanen commented on FELIX-5471:
--

Yes, we are stopping components while some other components are starting 
concurrently.
For instance, we use the dependency manager to create services as configured 
by an end user. So a configuration change may trigger the removal of 
services, which will be replaced by other or new services. On top of that, we 
have services launching other services, which are in turn required/used by 
other services.




[jira] [Commented] (FELIX-5471) Ensure that unbound services are always handled synchronously

2017-01-04 Thread Pierre De Rop (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799473#comment-15799473
 ] 

Pierre De Rop commented on FELIX-5471:
--

Hi Jeroen,

I have committed a patch in revision 1777378 to the ComponentImpl class in 
order to clarify the javadoc for the schedule method, and I renamed the 
"trySynchronous" argument of the schedule method to 
"bypassThreadPoolIfPossible" in order to make the method less confusing. So if 
there is no threadpool, the flag is ignored and the default serial queue 
executor is used (the task is run synchronously if the queue is not currently 
being run by another master thread). If a threadpool is used and 
bypassThreadPoolIfPossible is true, then the task is also run synchronously if 
the queue is not currently being run from the threadpool.

So, whether or not a ComponentExecutorFactory is used, we always try to handle 
removed service events synchronously if the component is not currently handling 
a service dependency (if its queue is idle, not busy).
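
The "run synchronously if the queue is idle, otherwise just enqueue" behavior described above can be sketched roughly as follows. This is a simplified illustration, not the actual ComponentImpl or serial executor code; SketchSerialQueue and its schedule method are hypothetical names, and the threadpool case is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a component's serial queue; not the DM implementation.
class SketchSerialQueue {
    private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
    // Thread currently draining the queue (the "master" thread), or null if idle.
    private final AtomicReference<Thread> master = new AtomicReference<>();

    /** Enqueues the task; returns true if it had already run when this call returned. */
    boolean schedule(Runnable task) {
        AtomicBoolean done = new AtomicBoolean();
        tasks.add(() -> {
            task.run();
            done.set(true);
        });
        Thread me = Thread.currentThread();
        // Become the master only if nobody is draining; otherwise leave the task queued.
        while (!tasks.isEmpty() && master.compareAndSet(null, me)) {
            try {
                Runnable next;
                while ((next = tasks.poll()) != null) {
                    next.run();
                }
            } finally {
                master.set(null);
            }
        }
        return done.get();
    }
}

class SerialQueueDemo {
    static List<String> run() {
        SketchSerialQueue queue = new SketchSerialQueue();
        List<String> order = new ArrayList<>();
        // Idle queue: the task runs in the calling thread before schedule() returns.
        boolean ranNow = queue.schedule(() -> order.add("remove handled"));
        order.add("idle queue ran task synchronously: " + ranNow);
        // Busy queue: a task scheduled while another task is running is only queued.
        queue.schedule(() -> {
            boolean nestedRanNow = queue.schedule(() -> order.add("nested remove handled"));
            order.add("busy queue ran task synchronously: " + nestedRanNow);
        });
        return order;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The second half of the demo is the interesting part: the nested task is accepted immediately but only executes after the task that was already running has finished, which is why a remove event cannot always be handled synchronously.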

Now let's try to clarify so you can estimate if you may or may not have the 
issue (whether or not you use a ComponentExecutorFactory):

* First let's recap the issue: sometimes, when a service X is being 
unregistered, the other services depending on X (M for example) may not be 
called in M.unbind(X) synchronously while X is being unregistered. So X may 
then be stopped at a point where M.unbind(X) has not yet been called (but will 
be called eventually, soon).

* When does the issue not happen? You won't have the ordering issue if you 
stop your components synchronously from a single thread and after all 
components have been started (as is the case, for example, when the 
framework is shutting down or when you manually stop a bundle from the gogo 
shell).

* When may the issue happen? You may have the issue if you concurrently add 
*and* remove some components at the same time. For example, you may have the 
issue in the following use case (whether or not you use concurrent DM):
** X, and Y are available, and M optionally depends on X, Y.
** you add M from thread T1
** then concurrently, you remove X from thread T2

So, using the scenario above,  it may happen that M.unbind(X) is called after 
X.stop().
This is because we have implemented the thread model in a non blocking way, 
using queues.
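
The T1/T2 scenario above can be reproduced deterministically with a small simulation. This is only a sketch under assumptions: ComponentQueue is a hypothetical non-blocking queue in the spirit of the description, the strings stand in for the real callbacks, and latches force the interleaving that would otherwise occur only occasionally.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class LateUnbindDemo {
    // Hypothetical non-blocking component queue; not the DM implementation.
    static final class ComponentQueue {
        private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
        private final AtomicReference<Thread> master = new AtomicReference<>();

        void schedule(Runnable task) {
            tasks.add(task);
            Thread me = Thread.currentThread();
            // Drain only if no other thread is draining; never block the caller.
            while (!tasks.isEmpty() && master.compareAndSet(null, me)) {
                try {
                    Runnable next;
                    while ((next = tasks.poll()) != null) {
                        next.run();
                    }
                } finally {
                    master.set(null);
                }
            }
        }
    }

    static List<String> run() {
        List<String> events = new CopyOnWriteArrayList<>();
        ComponentQueue queueOfM = new ComponentQueue();
        CountDownLatch mIsStarting = new CountDownLatch(1);
        CountDownLatch xIsStopped = new CountDownLatch(1);

        // T1 adds M: M's queue stays busy until X has been stopped.
        Thread t1 = new Thread(() -> queueOfM.schedule(() -> {
            events.add("M.start begins");
            mIsStarting.countDown();
            await(xIsStopped);
            events.add("M.start ends");
        }), "T1");

        // T2 removes X: the remove event is queued because M's queue is busy,
        // so T2 carries on and stops X before M.unbind(X) has run.
        Thread t2 = new Thread(() -> {
            await(mIsStarting);
            queueOfM.schedule(() -> events.add("M.unbind(X)"));
            events.add("X.stop()");
            xIsStopped.countDown();
        }, "T2");

        t1.start();
        t2.start();
        join(t1);
        join(t2);
        return events;
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this simulation the recorded order is M.start begins, X.stop(), M.start ends, M.unbind(X): the unbind callback is not lost, but it arrives after X has been stopped.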

So to summarize: if you manage to stop components from a single thread  and at 
a point where components are all fully started, then there is no issue.

Now, I also added two concurrent test cases:

- ServiceRaceWithOrderedUnbindTest.java: this junit test does not use DM in 
concurrent mode (no ComponentExecutorFactory is used, as is the case in 
default DM). So, the test uses manually created threads in order to perform 
concurrent component creations. Then, the components are unregistered from a 
single thread, and the test verifies that the unbind methods are called 
synchronously while the lost service dependencies are being unregistered.

- ServiceRaceParallelWithOrderedUnbindTest.java: same test as above, but this 
time we are using a ComponentExecutorFactory (concurrent DM is used). 

So, are you stopping components while some other components are starting?

thank you.



[jira] [Commented] (FELIX-5471) Ensure that unbound services are always handled synchronously

2017-01-02 Thread Jeroen Daanen (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15792649#comment-15792649
 ] 

Jeroen Daanen commented on FELIX-5471:
--

I am glad you could remove the timeout. The new implementation looks good to 
me; only the comments on lines 1695-1697 of ComponentImpl are a bit confusing 
(how could the SerialExecutor be missing at that point when trySynchronous is 
true, except when someone overrides getExecutor?).

In order to create stable and reliable software, I do have to be sure that the 
order is guaranteed. Also, I think the usage of the dependency manager is 
pretty concurrent in our software (when setting it to parallel), but I cannot 
estimate what the actual chance is of the order not being guaranteed. Can it 
be reproduced in a unit test? 
Anyway, I would really appreciate it if this could be fixed so that the order 
is always guaranteed.



[jira] [Commented] (FELIX-5471) Ensure that unbound services are always handled synchronously

2017-01-01 Thread Jeroen Daanen (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15791022#comment-15791022
 ] 

Jeroen Daanen commented on FELIX-5471:
--

Indeed, this solves the problems we talked about on the felix user mailing list.
However, I am wondering whether it is a good decision to introduce the timeout. 
If I am correct, the timeout occurs, for instance, when handling the remove 
callback takes a long time, e.g. when the component is processing the 
services it requires in a synchronized block while the remove callback is 
synchronized on the same lock, or when the implementation of the remove 
callback triggers a lot of work. This could easily take some time depending on 
your implementation, so you could run into this timeout pretty quickly. 
My suggestion would be to remove the timeout and the configurable property and 
just wait endlessly, because if that's happening I think there is another 
programming error (a deadlock or something) which must not be just 'ignored' by 
the warning message which is now logged when the timeout occurs.
What do you think, Pierre?



[jira] [Commented] (FELIX-5471) Ensure that unbound services are always handled synchronously

2016-12-30 Thread Pierre De Rop (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15787762#comment-15787762
 ] 

Pierre De Rop commented on FELIX-5471:
--

Committed patch in revision 1776574.

A new schedule(boolean synchronous, Runnable task) method has been added to 
ComponentImpl.java, allowing a task to be scheduled synchronously, when 
possible, through the internal component executor queue. The method uses a 
timeout of 30 seconds to protect against infinite wait, and the timeout can be 
configured using the DependencyManager.SCHEDULE_TIMEOUT constant, which is a 
bundle context property. It takes a value in millis (30000 by default).

So, the ComponentImpl class is now using the new schedule method in order to 
handle removed dependency events synchronously. The ComponentImpl.stop() method 
is now also using the same schedule method, and the 
InvocationUtil.invokeUpdated callback is also reusing the 
DependencyManager.SCHEDULE_TIMEOUT constant when handling CM configuration 
update events.
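
As an illustration of what a bounded synchronous wait looks like, here is a minimal sketch built on a plain ExecutorService and Future.get with a timeout. It is not the committed ComponentImpl code: the class and method names are hypothetical, and how the timeout value is read from the bundle context property is left out.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; the names are illustrative only.
class TimedScheduleSketch {
    /** Runs the task on the executor and waits for it; returns false on timeout. */
    static boolean scheduleAndWait(ExecutorService executor, Runnable task, long timeoutMillis) {
        Future<?> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Give up waiting; the task itself keeps running in the executor.
            System.err.println("task still running after " + timeoutMillis + " ms");
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }

    /** Returns { fast task completed in time, slow task completed in time }. */
    static boolean[] demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            boolean fast = scheduleAndWait(executor, () -> { }, 1000);
            boolean slow = scheduleAndWait(executor, () -> sleep(500), 50);
            return new boolean[] { fast, slow };
        } finally {
            executor.shutdownNow();
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("fast completed: " + result[0] + ", slow completed: " + result[1]);
    }
}
```

The trade-off is visible in the slow case: the caller is released after the timeout, but the callback it was waiting for has not finished, so the synchronous guarantee is lost exactly when the callback is slow.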



[jira] [Commented] (FELIX-5471) Ensure that unbound services are always handled synchronously

2016-12-30 Thread Pierre De Rop (JIRA)

[ 
https://issues.apache.org/jira/browse/FELIX-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15787744#comment-15787744
 ] 

Pierre De Rop commented on FELIX-5471:
--

Added test in revision 1776573:

org.apache.felix.dependencymanager.itest/src/org/apache/felix/dm/itest/api/FELIX5471_SynchronousUnbindTest.java
