On 1/1/13 11:13, Jad Naous wrote:
On Tue, Jan 1, 2013 at 7:58 AM, Richard S. Hall wrote:
On 1/1/13 10:37, Jad Naous wrote:
On Tue, Jan 1, 2013 at 6:21 AM, Richard S. Hall wrote:
On 1/1/13 06:09, Jad Naous wrote:

Happy new year!

I'm running into another deadlock now. It seems like there needs to be a more rigorous study of how locking is used in the framework. In particular, it seems like the framework should not be invoking any listeners within the same thread that's executing the stopping/starting/refreshing of bundles... Anywhere that happens there will be a potential for a deadlock because of a misordering of lock acquisition. The framework should never call into user code with any locks held.

Yeah, tell me about it, but it is not possible in all cases, unfortunately. I personally feel that all events should be asynchronous, but that's another story.

-> richard

I'm happy to help fix this, but I need some pointers as to what needs to happen. For synchronous events, what are the requirements? Or is this not fixable? Can the lock be released before firing the events?

I don't think it can be fixed; some of these things are baked into the spec. I think the spec even states somewhere that synchronous event listeners must be careful since they may be holding framework locks, so they shouldn't try to do too much. Of course, that is not easy advice to follow since framework impls vary. We even have bugs open that say we aren't holding locks when we should be, e.g.:

https://issues.apache.org/jira/browse/FELIX-3806

This is one of the poorly designed parts of the OSGi spec. It was meant to work in a world where services can "come and go at anytime", but it allows users to cling to synchronous events to do tons of work.

Then I guess this is something that ipojo is doing incorrectly?
Hard for me to say, but if there are cases where we can fire events without holding locks, then you are correct that we should avoid holding them. Likewise, users should avoid doing too much work when they receive synchronous events.
-> richard
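
For illustration, the idea of firing events without holding locks can be sketched in a few lines of Java. This is not Felix code: `OpenCallRegistry` and all of its members are hypothetical names made up for the example, and it uses Java 8 lambdas for brevity. The lock is held only long enough to update state and copy the listener list; the callbacks run after it is released.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical registry, not part of Felix: state changes happen under a
// lock, listener callbacks happen only after the lock has been released.
final class OpenCallRegistry {
    private final Object lock = new Object();
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private final List<String> services = new ArrayList<>();

    void addListener(Consumer<String> listener) {
        synchronized (lock) {
            listeners.add(listener);
        }
    }

    void register(String service) {
        List<Consumer<String>> snapshot;
        synchronized (lock) {
            services.add(service);
            // Copy the listeners so they can be iterated without the lock.
            snapshot = new ArrayList<>(listeners);
        }
        // The lock is not held here, so a listener may take its own locks
        // or call back into the registry without a lock-order inversion.
        for (Consumer<String> listener : snapshot) {
            listener.accept(service);
        }
    }

    // Helper for the demonstration: true if the caller holds the lock.
    boolean isLockHeldByCurrentThread() {
        return Thread.holdsLock(lock);
    }
}
```

The cost of this pattern is that a listener can observe state that has already moved on by the time it is called, which is one reason it may not fit every synchronous event the spec defines.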
-> richard

Name: Thread-2
State: WAITING on [Ljava.lang.Object;@4a018e1b
Total blocked: 38,871,649 Total waited: 38,871,650

Stack trace:
    java.lang.Object.wait(Native Method)
    java.lang.Object.wait(Object.java:485)
    org.apache.felix.framework.Felix.acquireGlobalLock(Felix.java:5033)
    org.apache.felix.framework.StatefulResolver.resolve(StatefulResolver.java:451)
    org.apache.felix.framework.BundleWiringImpl.searchDynamicImports(BundleWiringImpl.java:1578)
    org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1478)
    org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:75)
    org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1882)
    java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    org.apache.felix.framework.BundleWiringImpl.getClassByDelegation(BundleWiringImpl.java:1356)
    org.apache.felix.framework.ServiceRegistrationImpl$ServiceReferenceImpl.isAssignableTo(ServiceRegistrationImpl.java:548)
    org.apache.felix.framework.util.Util.isServiceAssignable(Util.java:280)
    org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:916)
    org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
    org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
    org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4346)
    org.apache.felix.framework.Felix.registerService(Felix.java:3356)
    org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:346)
    org.apache.felix.ipojo.IPojoFactory.start(IPojoFactory.java:613)
    - locked org.apache.felix.ipojo.ComponentFactory@468034b6
    org.apache.felix.ipojo.Extender.createAbstractFactory(Extender.java:520)
    org.apache.felix.ipojo.Extender.parse(Extender.java:301)
    org.apache.felix.ipojo.Extender.startManagementFor(Extender.java:237)
    org.apache.felix.ipojo.Extender.access$600(Extender.java:52)
    org.apache.felix.ipojo.Extender$CreatorThread.run(Extender.java:769)
    java.lang.Thread.run(Thread.java:662)

Name: FelixFrameworkWiring
State: BLOCKED on org.apache.felix.ipojo.ComponentFactory@468034b6 owned by: Thread-2
Total blocked: 7 Total waited: 1

Stack trace:
    org.apache.felix.ipojo.IPojoFactory.removeFactoryStateListener(IPojoFactory.java:511)
    org.apache.felix.ipojo.InstanceCreator.removeFactory(InstanceCreator.java:199)
    org.apache.felix.ipojo.Extender.closeManagementFor(Extender.java:180)
    org.apache.felix.ipojo.Extender.bundleChanged(Extender.java:153)
    org.apache.felix.framework.util.EventDispatcher.invokeBundleListenerCallback(EventDispatcher.java:868)
    org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:789)
    org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:514)
    org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4330)
    org.apache.felix.framework.Felix.stopBundle(Felix.java:2451)
    org.apache.felix.framework.Felix$RefreshHelper.stop(Felix.java:4715)
    org.apache.felix.framework.Felix.refreshPackages(Felix.java:4037)
    org.apache.felix.framework.FrameworkWiringImpl.run(FrameworkWiringImpl.java:178)
    java.lang.Thread.run(Thread.java:662)

On Sat, Dec 22, 2012 at 6:39 PM, Richard S. Hall wrote:
On 12/22/12 20:41, Jad Naous wrote:

Thanks! I was building with java6. Don't know if that is the issue.

Anyway, I tested the snapshot, and looks like that fixes it. Any idea when 4.1.0 will be released?
Do you know if http://svn.apache.org/viewvc?view=revision&revision=1421958 will apply cleanly onto 4.0.3? Otherwise, how stable do you think is 4.1.0-SNAPSHOT?

The release version will be 4.2.0, but the 4.1.0-SNAPSHOT build should be reasonably stable. We try to keep trunk stable. I want to try to get a release out soon, but I don't have a specific time table. I'll try to get it done in January if all goes well.

-> richard

Thanks!
Jad.

On Fri, Dec 21, 2012 at 6:49 AM, Richard S. Hall wrote:

It built fine for me. I was building with Java 7.

Regardless, I deployed snapshots of framework, main, and main.distribution, so just grab what you want from the Apache snapshot repo to try it out.

-> richard

On 12/20/12 18:56, Jad Naous wrote:

It does look like the same issue. Got the trunk/framework. mvn clean install gives:

[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ org.apache.felix.framework ---
Dec 20, 2012 3:54:59 PM org.sonatype.guice.bean.reflect.Logs$JULSink warn
WARNING: Error injecting: org.apache.maven.plugin.CompilerMojo
java.lang.NoClassDefFoundError: org/codehaus/plexus/compiler/util/scan/InclusionScanException

Thanks,
jad.

On Thu, Dec 20, 2012 at 3:22 PM, Richard S. Hall wrote:
On 12/20/12 4:10 PM, Jad Naous wrote:

If a bundle undergoes a refresh while ipojo is still initializing components, a deadlock can happen.
The issue is that if ipojo is attempting to register a service, it will be doing it while synchronizing on the InstanceCreator instance. It will then try to register a service which requires the framework's global lock.

Registering a service doesn't require a global lock, just a bundle lock. I think this could be related to:

https://issues.apache.org/jira/browse/FELIX-3761

This avoids grabbing the bundle lock when registering a service, so maybe it will help your situation. You could try to build the framework from trunk and see if it makes a difference. If you aren't able to build from trunk, let me know and I'll try to publish a snapshot build since I don't think we have a recent one (we should do this no matter what).

-> richard

If a refresh is happening in another thread, the refresh will be holding the framework's global lock, which will then call IPOJO's extender, which then attempts to call a method on InstanceCreator, hence leading to the deadlock. Here are the stack traces:

Daemon Thread [Thread-1] (Suspended)
    Object.wait(long) line: not available [native method]
    Object[](Object).wait() line: 485
    Felix.acquireBundleLock(BundleImpl, int) line: 4871
    Felix.registerService(BundleImpl, String[], Object, Dictionary) line: 3205
    BundleContextImpl.registerService(String[], Object, Dictionary) line: 346
    IPojoContext.registerService(String[], Object, Dictionary) line: 385
    ProvidedService.registerService() line: 362
    ProvidedServiceHandler.__M_stateChanged(int) line: 509
    ProvidedServiceHandler.stateChanged(int) line: not available
    InstanceManager.setState(int) line: 536
    InstanceManager.start() line: 418
    ComponentFactory.createInstance(Dictionary, IPojoContext, HandlerManager[]) line: 179
    ComponentFactory(IPojoFactory).createComponentInstance(Dictionary, ServiceContext) line: 310
    ComponentFactory(IPojoFactory).createComponentInstance(Dictionary) line: 239
    InstanceCreator$ManagedInstance.create(IPojoFactory) line: 355
    InstanceCreator.addInstance(Dictionary, long) line: 89
    Extender.parse(Bundle, String) line: 306
    Extender.startManagementFor(Bundle) line: 237
    Extender.access$600(Extender, Bundle) line: 52
    Extender$CreatorThread.run() line: 769
    Thread.run() line: 662

Daemon Thread [FelixFrameworkWiring] (Suspended)
    InstanceCreator.removeInstancesFromBundle(long) line: 116
    Extender.closeManagementFor(Bundle) line: 171
    Extender.bundleChanged(BundleEvent) line: 153
    EventDispatcher.invokeBundleListenerCallback(Bundle, EventListener, EventObject) line: 868
    EventDispatcher.fireEventImmediately(EventDispatcher, int, Map, EventObject, Dictionary) line: 789
    EventDispatcher.fireBundleEvent(BundleEvent, Framework) line: 514
    Felix.fireBundleEvent(int, Bundle) line: 4244
    Felix.stopBundle(BundleImpl, boolean) line: 2351
    Felix$RefreshHelper.stop() line: 4629
    Felix.refreshPackages(Collection, FrameworkListener[]) line: 3951
    FrameworkWiringImpl.run() line: 172
    Thread.run() line: 662
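
The two traces show one thread holding the component-side lock while waiting for a framework lock, and another holding a framework lock while waiting for the component-side lock. A minimal Java sketch of that opposite-order acquisition follows. It is a stand-in only: `LockOrderInversionDemo`, `frameworkLock`, and `componentLock` are hypothetical names, plain `ReentrantLock`s replace the real Felix and iPOJO locks, and a timed `tryLock` is used so the demo ends instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo, not Felix or iPOJO code.
final class LockOrderInversionDemo {
    static final ReentrantLock frameworkLock = new ReentrantLock();
    static final ReentrantLock componentLock = new ReentrantLock();

    /** Returns how many of the two threads managed to take their second lock. */
    static int run() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger acquired = new AtomicInteger();

        // "creator" mimics the component thread: component lock, then framework lock.
        Thread creator = new Thread(() ->
                attempt(componentLock, frameworkLock, bothHoldFirst, bothTried, acquired));
        // "refresher" mimics the refresh thread: framework lock, then component lock.
        Thread refresher = new Thread(() ->
                attempt(frameworkLock, componentLock, bothHoldFirst, bothTried, acquired));

        creator.start();
        refresher.start();
        try {
            creator.join();
            refresher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired.get();
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                AtomicInteger acquired) {
        first.lock();
        try {
            // Wait until each thread holds its first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // With plain lock() this would block forever; tryLock times out.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                acquired.incrementAndGet();
                second.unlock();
            }
            // Keep the first lock until both threads have made their attempt.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }
}
```

Calling `LockOrderInversionDemo.run()` returns 0, since neither thread can take its second lock while the other still holds it. Having both threads take the locks in the same order, or not holding the first lock across the call that needs the second, removes the cycle.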
