On Tue, Jan 1, 2013 at 7:58 AM, Richard S. Hall <[email protected]> wrote:
> On 1/1/13 10:37, Jad Naous wrote:
>
>> On Tue, Jan 1, 2013 at 6:21 AM, Richard S. Hall <[email protected]> wrote:
>>
>>> On 1/1/13 06:09, Jad Naous wrote:
>>>
>>>> Happy new year!
>>>>
>>>> I'm running into another deadlock now. It seems like there needs to be a
>>>> more rigorous study of how locking is used in the framework. In particular,
>>>> it seems like the framework should not be invoking any listeners within the
>>>> same thread that's executing the stopping/starting/refreshing of bundles...
>>>> Anywhere that happens there will be a potential for a deadlock because of a
>>>> misordering of lock acquisition. The framework should never call into user
>>>> code with any locks held.
>>>
>>> Yeah, tell me about it, but it is not possible in all cases,
>>> unfortunately. I personally feel that all events should be asynchronous,
>>> but that's another story.
>>>
>>> -> richard
>>
>> I'm happy to help fix this, but I need some pointers as to what needs to
>> happen. For synchronous events, what are the requirements? Or is this not
>> fixable? Can the lock be released before firing the events?
>
> I don't think it can be fixed; some of these things are baked into the
> spec. I think the spec even states somewhere that synchronous event
> listeners must be careful since they may be holding framework locks, so
> they shouldn't try to do too much. Of course, that is not easy advice to
> follow since framework impls vary.
>
> We even have bugs open that say we aren't holding locks when we should be,
> e.g.:
>
> https://issues.apache.org/jira/browse/FELIX-3806
>
> This is one of the poorly designed parts of the OSGi spec. It was meant to
> work in a world where services can "come and go at any time", but it allows
> users to cling to synchronous events to do tons of work.

Then I guess this is something that ipojo is doing incorrectly?
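
One way a listener can follow the "don't do too much" advice quoted above is to hand its work to a thread of its own and return right away, so nothing of its own is locked while the framework may still be holding locks. The following is only a minimal sketch under that assumption. `DeferringListener`, `bundleChanged(String)`, and `cleanUp` are hypothetical names for illustration; they are not the OSGi listener interfaces and not iPOJO code. Deferring also means the cleanup finishes after the callback has returned, which may not be acceptable for every synchronous event.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical listener, for illustration only.
class DeferringListener {
    // A single worker thread keeps deferred events in arrival order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicInteger handled = new AtomicInteger();

    // Called on the framework's thread, possibly while it holds its locks.
    // Only queues the work, so no monitor of ours is taken here.
    void bundleChanged(final String bundleName) {
        worker.execute(() -> cleanUp(bundleName));
    }

    // Runs later on the worker thread, outside the framework's call stack.
    private void cleanUp(String bundleName) {
        handled.incrementAndGet();
    }

    // Stops accepting work, waits for queued tasks, reports how many ran.
    int shutdownAndCount() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }
}
```

Whether this fits depends on whether the cleanup really has to be complete before the framework continues stopping the bundle.
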
> -> richard
>
>>>> Name: Thread-2
>>>> State: WAITING on [Ljava.lang.Object;@4a018e1b
>>>> Total blocked: 38,871,649 Total waited: 38,871,650
>>>>
>>>> Stack trace:
>>>> java.lang.Object.wait(Native Method)
>>>> java.lang.Object.wait(Object.java:485)
>>>> org.apache.felix.framework.Felix.acquireGlobalLock(Felix.java:5033)
>>>> org.apache.felix.framework.StatefulResolver.resolve(StatefulResolver.java:451)
>>>> org.apache.felix.framework.BundleWiringImpl.searchDynamicImports(BundleWiringImpl.java:1578)
>>>> org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1478)
>>>> org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:75)
>>>> org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1882)
>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>>>> org.apache.felix.framework.BundleWiringImpl.getClassByDelegation(BundleWiringImpl.java:1356)
>>>> org.apache.felix.framework.ServiceRegistrationImpl$ServiceReferenceImpl.isAssignableTo(ServiceRegistrationImpl.java:548)
>>>> org.apache.felix.framework.util.Util.isServiceAssignable(Util.java:280)
>>>> org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:916)
>>>> org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
>>>> org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
>>>> org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4346)
>>>> org.apache.felix.framework.Felix.registerService(Felix.java:3356)
>>>> org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:346)
>>>> org.apache.felix.ipojo.IPojoFactory.start(IPojoFactory.java:613)
>>>> - locked org.apache.felix.ipojo.ComponentFactory@468034b6
>>>> org.apache.felix.ipojo.Extender.createAbstractFactory(Extender.java:520)
>>>> org.apache.felix.ipojo.Extender.parse(Extender.java:301)
>>>> org.apache.felix.ipojo.Extender.startManagementFor(Extender.java:237)
>>>> org.apache.felix.ipojo.Extender.access$600(Extender.java:52)
>>>> org.apache.felix.ipojo.Extender$CreatorThread.run(Extender.java:769)
>>>> java.lang.Thread.run(Thread.java:662)
>>>>
>>>> Name: FelixFrameworkWiring
>>>> State: BLOCKED on org.apache.felix.ipojo.ComponentFactory@468034b6 owned by: Thread-2
>>>> Total blocked: 7 Total waited: 1
>>>>
>>>> Stack trace:
>>>> org.apache.felix.ipojo.IPojoFactory.removeFactoryStateListener(IPojoFactory.java:511)
>>>> org.apache.felix.ipojo.InstanceCreator.removeFactory(InstanceCreator.java:199)
>>>> org.apache.felix.ipojo.Extender.closeManagementFor(Extender.java:180)
>>>> org.apache.felix.ipojo.Extender.bundleChanged(Extender.java:153)
>>>> org.apache.felix.framework.util.EventDispatcher.invokeBundleListenerCallback(EventDispatcher.java:868)
>>>> org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:789)
>>>> org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:514)
>>>> org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4330)
>>>> org.apache.felix.framework.Felix.stopBundle(Felix.java:2451)
>>>> org.apache.felix.framework.Felix$RefreshHelper.stop(Felix.java:4715)
>>>> org.apache.felix.framework.Felix.refreshPackages(Felix.java:4037)
>>>> org.apache.felix.framework.FrameworkWiringImpl.run(FrameworkWiringImpl.java:178)
>>>> java.lang.Thread.run(Thread.java:662)
>>>>
>>>> On Sat, Dec 22, 2012 at 6:39 PM, Richard S. Hall <[email protected]> wrote:
>>>>
>>>>> On 12/22/12 20:41, Jad Naous wrote:
>>>>>
>>>>>> Thanks! I was building with java6. Don't know if that is the issue.
>>>>>> Anyway, I tested the snapshot, and looks like that fixes it.
>>>>>>
>>>>>> Any idea when 4.1.0 will be released? Do you know if
>>>>>> http://svn.apache.org/viewvc?view=revision&revision=1421958
>>>>>> will apply cleanly onto 4.0.3? Otherwise, how stable do you think is
>>>>>> 4.1.0-SNAPSHOT?
>>>>>
>>>>> The release version will be 4.2.0, but the 4.1.0-SNAPSHOT build should be
>>>>> reasonably stable. We try to keep trunk stable.
>>>>>
>>>>> I want to try to get a release out soon, but I don't have a specific time
>>>>> table. I'll try to get it done in January if all goes well.
>>>>>
>>>>> -> richard
>>>>>
>>>>>> Thanks!
>>>>>> Jad.
>>>>>>
>>>>>> On Fri, Dec 21, 2012 at 6:49 AM, Richard S. Hall <[email protected]> wrote:
>>>>>>
>>>>>>> It built fine for me. I was building with Java 7.
>>>>>>>
>>>>>>> Regardless, I deployed snapshots of framework, main, and
>>>>>>> main.distribution, so just grab what you want from the Apache snapshot
>>>>>>> repo to try it out.
>>>>>>>
>>>>>>> -> richard
>>>>>>>
>>>>>>> On 12/20/12 18:56, Jad Naous wrote:
>>>>>>>
>>>>>>>> It does look like the same issue.
>>>>>>>>
>>>>>>>> Got the trunk/framework. mvn clean install gives:
>>>>>>>>
>>>>>>>> [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ org.apache.felix.framework ---
>>>>>>>> Dec 20, 2012 3:54:59 PM org.sonatype.guice.bean.reflect.Logs$JULSink warn
>>>>>>>> WARNING: Error injecting: org.apache.maven.plugin.CompilerMojo
>>>>>>>> java.lang.NoClassDefFoundError: org/codehaus/plexus/compiler/util/scan/InclusionScanException
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> jad.
>>>>>>>>
>>>>>>>> On Thu, Dec 20, 2012 at 3:22 PM, Richard S. Hall <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On 12/20/12 4:10 PM, Jad Naous wrote:
>>>>>>>>>
>>>>>>>>>> If a bundle undergoes a refresh while ipojo is still initializing
>>>>>>>>>> components, a deadlock can happen. The issue is that if ipojo is
>>>>>>>>>> attempting to register a service, it will be doing it while synchronizing
>>>>>>>>>> on the InstanceCreator instance. It will then try to register a service
>>>>>>>>>> which requires the framework's global lock.
>>>>>>>>>
>>>>>>>>> Registering a service doesn't require a global, just a bundle lock.
>>>>>>>>> I think this could be related to:
>>>>>>>>>
>>>>>>>>> https://issues.apache.org/jira/browse/FELIX-3761
>>>>>>>>>
>>>>>>>>> This avoids grabbing the bundle lock when registering a service, so maybe
>>>>>>>>> it will help your situation. You could try to build the framework from
>>>>>>>>> trunk and see if it makes a difference.
>>>>>>>>>
>>>>>>>>> If you aren't able to build from trunk, let me know and I'll try to
>>>>>>>>> publish a snapshot build since I don't think we have a recent one (we
>>>>>>>>> should do this no matter what).
>>>>>>>>>
>>>>>>>>> -> richard
>>>>>>>>>
>>>>>>>>>> If a refresh is happening in another thread, the refresh will be holding
>>>>>>>>>> the framework's global lock, which will then call IPOJO's extender, which
>>>>>>>>>> then attempts to call a method on InstanceCreator, hence leading to the
>>>>>>>>>> deadlock.
>>>>>>>>>>
>>>>>>>>>> Here are the stack traces:
>>>>>>>>>>
>>>>>>>>>> Daemon Thread [Thread-1] (Suspended)
>>>>>>>>>> Object.wait(long) line: not available [native method]
>>>>>>>>>> Object[](Object).wait() line: 485
>>>>>>>>>> Felix.acquireBundleLock(BundleImpl, int) line: 4871
>>>>>>>>>> Felix.registerService(BundleImpl, String[], Object, Dictionary) line: 3205
>>>>>>>>>> BundleContextImpl.registerService(String[], Object, Dictionary) line: 346
>>>>>>>>>> IPojoContext.registerService(String[], Object, Dictionary) line: 385
>>>>>>>>>> ProvidedService.registerService() line: 362
>>>>>>>>>> ProvidedServiceHandler.__M_stateChanged(int) line: 509
>>>>>>>>>> ProvidedServiceHandler.stateChanged(int) line: not available
>>>>>>>>>> InstanceManager.setState(int) line: 536
>>>>>>>>>> InstanceManager.start() line: 418
>>>>>>>>>> ComponentFactory.createInstance(Dictionary, IPojoContext, HandlerManager[]) line: 179
>>>>>>>>>> ComponentFactory(IPojoFactory).createComponentInstance(Dictionary, ServiceContext) line: 310
>>>>>>>>>> ComponentFactory(IPojoFactory).createComponentInstance(Dictionary) line: 239
>>>>>>>>>> InstanceCreator$ManagedInstance.create(IPojoFactory) line: 355
>>>>>>>>>> InstanceCreator.addInstance(Dictionary, long) line: 89
>>>>>>>>>> Extender.parse(Bundle, String) line: 306
>>>>>>>>>> Extender.startManagementFor(Bundle) line: 237
>>>>>>>>>> Extender.access$600(Extender, Bundle) line: 52
>>>>>>>>>> Extender$CreatorThread.run() line: 769
>>>>>>>>>> Thread.run() line: 662
>>>>>>>>>>
>>>>>>>>>> Daemon Thread [FelixFrameworkWiring] (Suspended)
>>>>>>>>>> InstanceCreator.removeInstancesFromBundle(long) line: 116
>>>>>>>>>> Extender.closeManagementFor(Bundle) line: 171
>>>>>>>>>> Extender.bundleChanged(BundleEvent) line: 153
>>>>>>>>>> EventDispatcher.invokeBundleListenerCallback(Bundle, EventListener, EventObject) line: 868
>>>>>>>>>> EventDispatcher.fireEventImmediately(EventDispatcher, int, Map, EventObject, Dictionary) line: 789
>>>>>>>>>> EventDispatcher.fireBundleEvent(BundleEvent, Framework) line: 514
>>>>>>>>>> Felix.fireBundleEvent(int, Bundle) line: 4244
>>>>>>>>>> Felix.stopBundle(BundleImpl, boolean) line: 2351
>>>>>>>>>> Felix$RefreshHelper.stop() line: 4629
>>>>>>>>>> Felix.refreshPackages(Collection, FrameworkListener[]) line: 3951
>>>>>>>>>> FrameworkWiringImpl.run() line: 172
>>>>>>>>>> Thread.run() line: 662
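
Both pairs of traces in this thread have the same shape: one thread holds an iPOJO monitor and waits for a framework lock, while the framework's refresh thread holds its lock and waits for that same iPOJO monitor. A common way to break such a cycle is the "open call": decide what to do under your own lock, release it, and only then call into the framework. The sketch below illustrates that idea only. `Registrar`, `SketchFactory`, and `OpenCallDemo` are hypothetical stand-ins, not iPOJO or Felix classes, and `frameworkLock` merely plays the role of the framework's global lock. It also assumes the factory's state change can safely become visible before the registration completes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a framework call that needs a framework lock,
// such as BundleContext.registerService in the traces above.
interface Registrar {
    void register(String name);
}

// Hypothetical factory, not the iPOJO class.
class SketchFactory {
    private final Object lock = new Object();
    private final List<String> listeners = new ArrayList<>();
    private boolean started;

    void addListener(String id) {
        synchronized (lock) { listeners.add(id); }
    }

    // Reached from a framework event callback; holds the monitor only briefly.
    void removeListener(String id) {
        synchronized (lock) { listeners.remove(id); }
    }

    int listenerCount() {
        synchronized (lock) { return listeners.size(); }
    }

    void start(Registrar registrar) {
        synchronized (lock) {
            if (started) {
                return;
            }
            started = true; // the state change is decided under our own lock
        }
        // Open call: our monitor is released before entering the framework.
        registrar.register("factory");
    }
}

class OpenCallDemo {
    // Returns true if both threads finish and the expected work was done.
    static boolean run() {
        final Object frameworkLock = new Object(); // plays the global lock
        final SketchFactory factory = new SketchFactory();
        final List<String> registered = new ArrayList<>();
        final CountDownLatch lockHeld = new CountDownLatch(1);
        factory.addListener("extender");

        // Refresh thread: holds the framework lock, then calls the listener.
        Thread refresh = new Thread(() -> {
            synchronized (frameworkLock) {
                lockHeld.countDown();
                factory.removeListener("extender");
            }
        });
        // Creator thread: starts the factory, which registers a service.
        Thread creator = new Thread(() -> {
            try {
                lockHeld.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            factory.start(name -> {
                synchronized (frameworkLock) { registered.add(name); }
            });
        });
        refresh.start();
        creator.start();
        try {
            refresh.join(5000);
            creator.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (frameworkLock) {
            return !refresh.isAlive() && !creator.isAlive()
                    && registered.size() == 1 && factory.listenerCount() == 0;
        }
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + run());
    }
}
```

If `start` instead called `registrar.register` inside its `synchronized` block, the two threads could each end up holding the lock the other needs, which is the situation the traces show.
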
