Hi Pierre,

It would indeed be useful to find out more about why your test is hanging. Maybe analysing a thread dump would give some more information?
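If attaching a tool such as jstack to the running JVM is awkward, the same information can be collected from inside the process. Below is a minimal sketch using the standard java.lang.management API. The class name ThreadDumpSketch is made up for illustration and is not part of Felix or of the benchmark tool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Felix or the benchmark tool.
final class ThreadDumpSketch {

    /** Builds a plain-text dump of all live threads in this JVM. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only ask for lock details that this JVM reports as supported.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The monitor or synchronizer this thread is blocked or waiting on.
                out.append(" on ").append(info.getLockName());
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In a hang like this one, the interesting entries would probably be the ForkJoinPool worker threads and whatever they are blocked or waiting on.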
Cheers,

David

On 14 May 2015 at 19:54, Pierre De Rop <pierre.de...@gmail.com> wrote:
> Thanks David; I just gave it a try, and indeed the parallel test passed. I
> observed a gain of around 7/10%. The tool is described in [1].
>
> But I only have 4 cores on my laptop, so I will run more tests in my lab
> at work (next week), where we have some servers with 32 or even 128
> processors. This will give a better idea of the gain: the more
> processors you have, the more costly synchronization is, so I could possibly
> observe a better performance gain.
>
> Now, I'm sorry, but I think that there is still a problem (I don't know
> where): when using more threads, the parallel test does not complete and
> stops with a timeout message, indicating that the expected number of
> components was not created within the timeout delay of 1 minute.
>
> So, I just committed a modified version of the tool in the sandbox, which
> can now take a -Dthreads option to configure the number of
> threads. With -Dthreads=4, it's OK. But with -Dthreads=10, the test does
> not complete and ends with a timeout:
>
> $ java -Dthreads=10 -server -jar bin/felix.jar
>
> g! Starting benchmarks (each tested bundle will add/remove 630 components
> during bundle activation).
>
>         [Starting benchmarks with no processing done in components start
> methods]
>
> Benchmarking bundle:
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> .................................................Could not start components
> timely: current start latch=2, stop latch=630
>
> My current understanding is that some components are still waiting
> for unsatisfied service dependencies, just as if a service tracker had
> missed a service registration.
>
> I ran the same test for two hours with the previous framework version,
> and did not observe any problems.
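To make the timeout message above concrete: it is consistent with a latch that never reaches zero because a couple of expected events never happen. The following is only a sketch of that pattern and not the benchmark's actual code. LatchTimeoutSketch, its method and its parameters are hypothetical; only the "threads" system property name is taken from the mail above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not the code of the benchmark tool.
final class LatchTimeoutSketch {

    /**
     * Starts (components - lost) tasks that each count the latch down once,
     * waits up to timeoutMillis, and returns how many counts are still missing.
     */
    static long runAndCountMissing(int threads, int components, int lost, long timeoutMillis) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        CountDownLatch started = new CountDownLatch(components);
        try {
            // "lost" simulates components whose start event never arrives.
            for (int i = 0; i < components - lost; i++) {
                pool.execute(started::countDown);
            }
            started.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        } finally {
            pool.shutdown();
        }
        return started.getCount();
    }

    public static void main(String[] args) {
        int threads = Integer.getInteger("threads", 4); // e.g. -Dthreads=10
        long missing = runAndCountMissing(threads, 630, 2, 1000L);
        System.out.println("start latch=" + missing);
    }
}
```

A non-zero remaining count after the timeout only says that some events were never delivered; it does not say where they were lost, which is why a thread dump or a second test tool would still be needed.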
>
> I wonder if someone else has another tool for performing a different
> kind of load test, just to see whether the same problems are observed.
>
> -> From my side, I will do the following: in the past, the benchmark tool
> supported not only dependencymanager, but also Felix SCR and iPojo. So, I
> will reintroduce Felix SCR in the benchmark and check whether I also
> observe the problem (with -Dthreads=10).
>
> I will let you know.
>
> cheers;
> /Pierre
>
> [1]
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
>
> On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
> david.bosscha...@gmail.com> wrote:
>
>> I've fixed this now in
>> svn.apache.org/viewvc?view=revision&revision=1679367
>>
>> Pierre, your loadtest now runs to completion - thanks for reporting
>> this issue! I can see that the results for the parallel tests are a
>> little bit different than before, but I'm not sure how to read them so
>> I'll leave the interpretation of that to you :)
>>
>> Cheers,
>>
>> David
>>
>> On 14 May 2015 at 14:38, David Bosschaert <david.bosscha...@gmail.com>
>> wrote:
>> > I think I know what this is. I had some additional changes exactly in
>> > this area that I simply forgot to apply this morning. I should have it
>> > fixed sometime today.
>> >
>> > Cheers,
>> >
>> > David
>> >
>> > On 14 May 2015 at 14:03, David Bosschaert <david.bosscha...@gmail.com> wrote:
>> >> Hi Pierre,
>> >>
>> >> I'll take a look today.
>> >>
>> >> Cheers,
>> >>
>> >> David
>> >>
>> >> On 14 May 2015 at 14:00, Pierre De Rop <pierre.de...@gmail.com> wrote:
>> >>> I just committed the benchmark tool in
>> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you can
>> >>> take a look.
>> >>>
>> >>> To run the scenario:
>> >>>
>> >>> - install jdk8:
>> >>>
>> >>> [nxuser@nx0012 pderop]$ java -version
>> >>> java version "1.8.0_40"
>> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
>> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>> >>>
>> >>> - checkout the loadtest from
>> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
>> >>>
>> >>> - go to the "loadtest" directory and start the test, just like this:
>> >>>
>> >>> $ java -server -jar bin/felix.jar
>> >>> Welcome to Apache Felix Gogo
>> >>>
>> >>> g! Starting benchmarks (each tested bundle will add/remove 630 components
>> >>> during bundle activation).
>> >>>
>> >>>         [Starting benchmarks with no processing done in components start
>> >>> methods]
>> >>>
>> >>> Benchmarking bundle:
>> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
>> >>> ..................................................
>> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 | 319,631,722
>> >>> | 919,838,078]
>> >>>
>> >>> Benchmarking bundle:
>> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>> >>>
>> >>>
>> >>> Here, the first
>> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager" test
>> >>> (single-threaded) passes OK. But the next one hangs
>> >>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
>> >>> It uses a fork join pool with size=4.
>> >>>
>> >>> and when typing "log warn", we see:
>> >>>
>> >>> "log warn"
>> >>>
>> >>> 2015.05.14 13:56:10 ERROR - Bundle:
>> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >>> java.util.ConcurrentModificationException
>> >>>     at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >>>     at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >>>     at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >>>     at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >>>     at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >>>     at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >>>     at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >>>     at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >>>     at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >>>     at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >>>     at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >>>     at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >>>     at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >>>     at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >>>     at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >>>     at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >>>     at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >>>     at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >>>     at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >>>     at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >>>     at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >>>     at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >>>     at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >>>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >>>     at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >>>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >>>     at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >>>
>> >>>
>> >>> (I will also investigate my own code, to check whether the problem
>> >>> comes from me.)
>> >>>
>> >>> cheers;
>> >>> /Pierre
>> >>>
>> >>>
>> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <pierre.de...@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> Hi David,
>> >>>>
>> >>>> I don't know if it's me (a bug in my benchmark tool) or if there is a
>> >>>> regression somewhere in the framework, but my parallel test does not pass
>> >>>> anymore.
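For readers unfamiliar with this exception: HashMap iterators are fail-fast, so walking the key set (as AbstractCollection.addAll does in the trace above) while the map is structurally modified throws ConcurrentModificationException on a best-effort basis. Below is a minimal single-threaded sketch of that behaviour, with ConcurrentHashMap for contrast. It is an illustration only and says nothing about how CapabilitySet was actually changed; CmeSketch and the key names are invented.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, unrelated to the real CapabilitySet code.
final class CmeSketch {

    /**
     * Copies the keys of the map while adding one entry mid-iteration.
     * Returns true if the copy completed, false if the iterator failed fast.
     */
    static boolean copyWhileMutating(Map<String, Boolean> map) {
        for (int i = 0; i < 16; i++) {
            map.put("svc" + i, Boolean.TRUE);
        }
        Set<String> copy = new HashSet<String>();
        boolean mutated = false;
        try {
            for (String key : map.keySet()) {
                copy.add(key);
                if (!mutated) {
                    // Stands in for another thread registering a service
                    // while this one is still iterating.
                    map.put("registered-late", Boolean.TRUE);
                    mutated = true;
                }
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap copy completed: "
                + copyWhileMutating(new java.util.HashMap<String, Boolean>()));
        System.out.println("ConcurrentHashMap copy completed: "
                + copyWhileMutating(new java.util.concurrent.ConcurrentHashMap<String, Boolean>()));
    }
}
```

With real threads the HashMap failure is not guaranteed on every run, which would fit a problem that shows up with a pool of 4 threads but not with 1.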
>> >>>>
>> >>>> The test first starts with a single-threaded scenario, which passes OK
>> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager), then when
>> >>>> the parallel test starts
>> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
>> >>>> it suddenly hangs, and when I type "log warn" under the gogo shell, I see
>> >>>> the following exception:
>> >>>>
>> >>>> (I'm using java8):
>> >>>>
>> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
>> >>>> ____________________________
>> >>>> Welcome to Apache Felix Gogo
>> >>>>
>> >>>> Benchmarking bundle:
>> >>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>> >>>>
>> >>>> (here, the dependencymanager.parallel test hangs and when I type "log
>> >>>> warn", I see this:)
>> >>>>
>> >>>> g! log warn
>> >>>> 2015.05.14 13:31:03 ERROR - Bundle:
>> >>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >>>> java.util.ConcurrentModificationException
>> >>>>     at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >>>>     at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >>>>     at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >>>>     at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >>>>     at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >>>>     at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >>>>     at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >>>>     at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >>>>     at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >>>>     at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >>>>     at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >>>>     at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >>>>     at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >>>>     at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >>>>     at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >>>>     at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >>>>     at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >>>>     at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >>>>     at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >>>>     at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >>>>     at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >>>>     at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >>>>     at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >>>>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >>>>     at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >>>>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >>>>     at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >>>>
>> >>>> (If I configure my threadpool to 1, I have no problems, but with
>> >>>> threadpool=4, then I have the problem)
>> >>>>
>> >>>> I will investigate, but ideally it would be helpful if you could
>> >>>> also run the test yourself, so I will soon commit something to reproduce
>> >>>> the problem in my sandbox.
>> >>>>
>> >>>> cheers;
>> >>>> /Pierre
>> >>>>
>> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
>> >>>> david.bosscha...@gmail.com> wrote:
>> >>>>
>> >>>>> I've committed this now in
>> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
>> >>>>>
>> >>>>> Curious to see what others are measuring. My tests were focused on
>> >>>>> multiple bundles/threads obtaining the same service, as that's where I
>> >>>>> saw a bit of contention.
>> >>>>>
>> >>>>> Cheers,
>> >>>>>
>> >>>>> David
>> >>>>>
>> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pierre.de...@gmail.com> wrote:
>> >>>>> > Hi David,
>> >>>>> >
>> >>>>> > I'm looking forward to testing your improvements using the dependencymanager
>> >>>>> > benchmark tool ([1]).
>> >>>>> >
>> >>>>> >
>> >>>>> > [1]
>> >>>>> > http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>> >>>>> >
>> >>>>> > /Pierre
>> >>>>> >
>> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
>> >>>>> > david.bosscha...@gmail.com> wrote:
>> >>>>> >
>> >>>>> >> I have implemented the performance improvements that I was thinking of
>> >>>>> >> using Java 5 concurrency tools; they can be viewed at [1].
>> >>>>> >>
>> >>>>> >> I wrote a little performance test suite [2] that tests multithreaded
>> >>>>> >> service registry performance (10 threads) from single / multiple
>> >>>>> >> bundles with either singleton services or Prototype Service Factory
>> >>>>> >> services, and the results are quite impressive. I'm getting performance
>> >>>>> >> improvements compared to the current trunk from 8 times better than
>> >>>>> >> the original (800%) to more than 30 times better (3000%).
>> >>>>> >>
>> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
>> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
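As an illustration of the general approach (and explicitly not the code from the commit linked in that message), here is a toy registry built only on java.util.concurrent classes that have existed since Java 5, exercised by 10 threads without any synchronized block of its own. TinyRegistry and its methods are hypothetical names; the real Felix ServiceRegistry has far more to take care of.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical toy registry, not the Felix ServiceRegistry.
final class TinyRegistry {
    private final ConcurrentMap<String, List<Object>> byClass =
            new ConcurrentHashMap<String, List<Object>>();

    void register(String className, Object service) {
        List<Object> list = byClass.get(className);
        if (list == null) {
            List<Object> fresh = new CopyOnWriteArrayList<Object>();
            // putIfAbsent returns the existing list if another thread won the race.
            list = byClass.putIfAbsent(className, fresh);
            if (list == null) {
                list = fresh;
            }
        }
        list.add(service);
    }

    /** Returns a snapshot, so callers never iterate over a list that is being modified. */
    List<Object> lookup(String className) {
        List<Object> list = byClass.get(className);
        return list == null ? Collections.<Object>emptyList() : new ArrayList<Object>(list);
    }

    /** Registers and looks up from several threads; returns the final service count. */
    static int stress(int threads, final int perThread) {
        final TinyRegistry registry = new TinyRegistry();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    registry.register("svc", new Object());
                    registry.lookup("svc");
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return registry.lookup("svc").size();
    }

    public static void main(String[] args) {
        System.out.println("services registered: " + stress(10, 100));
    }
}
```

CopyOnWriteArrayList favours lookups over registrations, which seems a reasonable trade-off when many threads obtain the same service, but that is an assumption about the workload rather than a measured result.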
>> >>>>> >>
>> >>>>> >> Cheers,
>> >>>>> >>
>> >>>>> >> David
>> >>>>> >>
>> >>>>> >> [1]
>> >>>>> >> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>> >>>>> >> [2]
>> >>>>> >> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>> >>>>> >>
>> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org> wrote:
>> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
>> >>>>> >> >>
>> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <he...@ungoverned.org> wrote:
>> >>>>> >> >>>
>> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>> >>>>> >> >>>>
>> >>>>> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not sure if it
>> >>>>> >> >>>> can be the culprit though.
>> >>>>> >> >>>> Interrupts could also be caused by a bundle being shut down while one of its
>> >>>>> >> >>>> threads is waiting for a service, which should be a valid use case imho.
>> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being interrupted would be
>> >>>>> >> >>>> good.
>> >>>>> >> >>>
>> >>>>> >> >>>
>> >>>>> >> >>> Yes, threads can be interrupted if they are holding a bundle lock and the
>> >>>>> >> >>> global lock holder needs the bundle lock.
>> >>>>> >> >>>
>> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt here, but didn't we
>> >>>>> >> >>> implement service lookup so that a bundle lock wasn't necessary? I thought
>> >>>>> >> >>> we just checked for the validity of the bundle context before returning or
>> >>>>> >> >>> something. Perhaps we felt there was no reason to be interrupted in that
>> >>>>> >> >>> case. I really don't know.
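One common way of "sanely reacting" to an interrupt, sketched here under the assumption that the waiting code has nothing better to do than give up: stop waiting, restore the interrupt flag, and report the failure to the caller instead of swallowing it. InterruptSketch and awaitService are invented names and do not correspond to anything in the Felix framework.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration, not Felix code.
final class InterruptSketch {

    /**
     * Waits until the latch is released. Returns false if the thread was
     * interrupted, leaving the interrupt flag set for the caller to see.
     */
    static boolean awaitService(CountDownLatch available) {
        try {
            available.await();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // do not swallow the interrupt
            return false;
        }
    }

    /** Interrupts a waiter and reports whether it stopped with its flag preserved. */
    static boolean interruptedWaiterStopsCleanly() {
        final CountDownLatch never = new CountDownLatch(1); // never released
        final boolean[] result = new boolean[2];
        Thread waiter = new Thread(() -> {
            result[0] = awaitService(never);
            result[1] = Thread.currentThread().isInterrupted();
        });
        waiter.start();
        waiter.interrupt(); // e.g. the bundle is being stopped
        try {
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !result[0] && result[1];
    }

    public static void main(String[] args) {
        System.out.println("waiter stopped cleanly: " + interruptedWaiterStopsCleanly());
    }
}
```

Whether the framework should bail out like this or retry while holding the flag is exactly the design question raised above; the sketch only shows the first option.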
>> >>>>> >> >>
>> >>>>> >> >> I think that the Service Registry could be rewritten to be completely
>> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency libraries,
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > Well, that just moves the sync blocks to the library, but yeah sure.
>> >>>>> >> >
>> >>>>> >> >> which I think would really be a better approach. There is too much
>> >>>>> >> >> locking going on in the current SR implementation IMHO.
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > I don't really think there is too much, but it is complicated.
>> >>>>> >> > Unfortunately, it is complicated to make sure that locks aren't held while
>> >>>>> >> > doing service lookups, and this is complicated because you can run into cycles,
>> >>>>> >> > etc.
>> >>>>> >> >
>> >>>>> >> > But feel free to try to simplify it.
>> >>>>> >> >
>> >>>>> >> >>
>> >>>>> >> >> This raises the question: can we move to Java 5 (or Java 6) for the
>> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4 compatible but
>> >>>>> >> >> I would be surprised if there is anyone who still needs a JDK that
>> >>>>> >> >> went end-of-life 7 years ago.
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > At this point, it doesn't really matter to me.
>> >>>>> >> >
>> >>>>> >> > -> richard
>> >>>>> >> >
>> >>>>> >> >>
>> >>>>> >> >> Best regards,
>> >>>>> >> >>
>> >>>>> >> >> David
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >>
>> >>>>> >>
>> >>>>
>> >>>>
>>