Re: [rules-users] Parallelization in Planner
Hi Nurlan,

This feature isn't implemented yet, but it will be resolved together with the issue for multi-threading the Solver: https://issues.jboss.org/browse/JBRULES-681

I have a working design for the basics in my head and on paper. Basically, a Move will be parallelized over CPUs and JVMs. The critical thing that made this hard is not breaking incremental score calculation (deltas): breaking it would make the parallel version slower than single-threaded, not faster. I've found a way to keep it intact.

I'll look into JGroups to do the JVM communication by default, but I'll probably make that pluggable.

Note: better optimization algorithms usually beat adding more CPUs, but doing both is even better of course :)

On 13-03-12 05:20, Nurlan wrote:
> Hi, Guys!
>
> I want to know: is parallelization possible in drools-planner?
> I mean solving 1 planning problem concurrently in 2 or more JVMs.
>
> --
> View this message in context: http://drools.46999.n3.nabble.com/Paralization-tp3821286p3821286.html

--
With kind regards,
Geoffrey De Smet
___
rules-users mailing list
rules-users@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users
Re: [rules-users] Parallelization
Well, I just wanted to report some results...

I added the -server VM switch, just to be sure, but anyway, I was having endless trouble with my ThreadPoolExecutor, and switched to:

ExecutorService threadPool = Executors.newFixedThreadPool(5);

And my claims now process at an unfathomable 5ms each! A week ago, it was 330ms per claim. So, I've managed to get a 66-fold speed increase.

Thanks for all your help. I reckon the boss will be happy with this.

--
View this message in context: http://drools-java-rules-engine.46999.n3.nabble.com/Parallelization-tp809341p812388.html
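For readers who want to try the same switch, here is a minimal sketch of processing a batch of claims on a fixed pool and waiting for all of them to finish. The class `ClaimBatch` and the method `processClaim` are hypothetical stand-ins for the real per-claim work (creating a session, inserting facts, firing rules); only `Executors.newFixedThreadPool(5)` comes from the message above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; not taken from the original post.
class ClaimBatch {

    // Placeholder for the real per-claim rule evaluation (assumption):
    // here it simply doubles the claim id so the result is easy to check.
    static int processClaim(int claimId) {
        return claimId * 2;
    }

    // Runs every claim on a fixed-size pool and returns results in input order.
    static List<Integer> processAll(List<Integer> claimIds, int threads) {
        ExecutorService threadPool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (Integer id : claimIds) {
                tasks.add(() -> processClaim(id));
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until every submitted task has completed.
            for (Future<Integer> f : threadPool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing claims", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("claim processing failed", e.getCause());
        } finally {
            // Always release the worker threads, even on failure.
            threadPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList(1, 2, 3), 5));
    }
}
```

Because `invokeAll` only returns once all tasks are done, shared resources such as a database connection can be closed safely after `processAll` returns.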
Re: [rules-users] Parallelization
If you're using the Sun VM, multiple threads will use multiple cores just fine. What version of Java?

Are you intending to max out the CPU on the machine? If so, I suggest this configuration for an 8-core box: 7 threads, each running a StatefulKnowledgeSession, plus concurrent garbage collection, with VM startup parameters like this:

-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:MaxGCPauseMillis=150 -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=2

Here's a good reference for GC parameters: http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html

And if you're feeling frisky, try out these to see if they improve performance:

-XX:+UseTLAB -XX:+UseSpinning -XX:+UseFastAccessorMethods

The default recommendation for maxing out the CPU is to use (number of cores + 1) threads, but in this case, since garbage collection may be an issue, I'd reduce that by two. You need one core "dedicated" to other stuff (I/O and whatnot) and some resources for GC.
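The thread-count rule of thumb in this message (start from cores + 1, then take two off when GC is a concern) can be sketched as a small helper. The class and method names are hypothetical, and the floor of one thread is an added assumption so the result stays usable on small machines.

```java
// Hypothetical helper; names are illustrative, not from the thread.
class PoolSizing {

    // Default "max out the CPU" size is cores + 1; reduce by two to leave
    // room for I/O and garbage collection, as suggested in the message.
    static int workerThreads(int cores) {
        int defaultThreads = cores + 1;
        // Assumption: never go below one worker thread.
        return Math.max(1, defaultThreads - 2);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(cores + " cores -> " + workerThreads(cores) + " worker threads");
    }
}
```

For the 8-core box mentioned above this gives 7 threads, matching the suggested configuration.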
Re: [rules-users] Parallelization
I'm not a guru, but I'm pretty certain all modern JVMs support multiple cores well.

You will probably want to make sure you are using the server VM (-server) and not the client VM (assuming you are using the standard VM). Depending on what your current machine is, and whether you have two CPUs or just two cores (and whether Java can tell the difference), you may or may not currently be running the server VM (see http://java.sun.com/javase/6/docs/technotes/guides/vm/server-class.html).

If you aren't, then you may be very lucky and find that switching to the server-class VM gives you those few extra ms to keep your boss happy.

Presumably you have also done the other standard optimization tricks, such as ensuring you have sufficient heap memory?

Thomas
Re: [rules-users] Parallelization
Hi Daniel,

I was reading the other day that a JVM implementation does not necessarily have to map Java threads onto native threads (which is what lets them take advantage of multiple cores). If you saw a significant speedup, then I would assume your JVM does this.

It is worth investigating for your production deployment. I would think that recent JVMs on modern operating systems would support this, but I also wouldn't leave it up to chance. This post seems to imply that the only JVM/OS combinations that don't support native threads are Java 1.2 or Solaris: http://forums.sun.com/thread.jspa?threadID=5330507

About StatefulKnowledgeSessions: you should be able to run these in parallel, no problem.

-Steve
Re: [rules-users] Parallelization
On 11/05/2010 13:55, djb wrote:
> I use one thread per StatefulKnowledgeSession... My machine has 2 cores, but it will eventually be running on an 8 core beast, so I reckon this was a good improvement. I was just worried that I wouldn't be able to simultaneously process multiple K-Sessions, but apparently, Drools doesn't mind. I'm pretty sure any machine with multiple cores supports parallel java threads, no?

Multiple ksessions running in different threads, all sharing the same kbase, is perfectly acceptable; it was designed that way.

Mark
Re: [rules-users] Parallelization
The multi-threaded partitioning options are for simple CEP-type applications; you won't see a benefit elsewhere, and you may even see a slowdown if you are doing more business-type rules.

Mark

On 11/05/2010 09:40, djb wrote:
> Hi Drools squad,
>
> This is a follow-up to my previous speed-related post. My boss is still pushing to get 35ms down a bit, and I'm looking at parallelization options. I've looked through the forums, but not successfully...
>
> The options I see are:
>
> 1. KnowledgeBase partitioning (setting KnowledgeBaseConfiguration to use multi-threads)
> - I tried this, and got the error pasted at the bottom. My suspicion is that it starts a thread, and meanwhile the Java thread continues, and disposes of the session before evaluation is complete.
>
> 2. Creating multiple Java threads, each of which starts its own KnowledgeSession.
> - I started this, but need to confirm that this is possible. What's happening currently is that the Java thread continues and closes my database connection prematurely, so I am working on adding some sort of counting semaphore to wait for all the threads to complete before continuing the Java thread.
>
> Should I pursue either of these ideas? I will probably work on the second today. The other idea I had was to try Sequential Mode, but I don't think my data is applicable to a StatelessKnowledgeSession.
>
> Thanks,
> Daniel
>
> ***
> Partition task manager caught an unexpected exception: null
> Drools is capturing the exception to avoid thread death. Please report stack trace to development team.
> java.util.concurrent.RejectedExecutionException
>     at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1760)
>     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
>     at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(ThreadPoolExecutor.java:758)
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
>     at org.drools.reteoo.PartitionTaskManager.enqueue(PartitionTaskManager.java:75)
>     at org.drools.reteoo.AsyncCompositeObjectSinkAdapter.doPropagateAssertObject(AsyncCompositeObjectSinkAdapter.java:49)
>     at org.drools.reteoo.CompositeObjectSinkAdapter.propagateAssertObject(CompositeObjectSinkAdapter.java:344)
>     at org.drools.reteoo.AlphaNode.assertObject(AlphaNode.java:147)
>     at org.drools.reteoo.PartitionTaskManager$FactAssertAction.execute(PartitionTaskManager.java:188)
>     at org.drools.reteoo.PartitionTaskManager$PartitionTask.run(PartitionTaskManager.java:112)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
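Option 2 in the quoted post needs the launching thread to wait for every worker before it closes shared resources such as the database connection. The post only says "some sort of counting semaphore"; one possible choice from the standard library is `java.util.concurrent.CountDownLatch`. The sketch below assumes that choice, and the class name `WaitForWorkers` and the placeholder work inside each thread are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; the per-thread work is a placeholder.
class WaitForWorkers {

    // Starts `workers` threads, blocks until all have finished,
    // and returns how many completed their work.
    static int runAndWait(int workers) {
        CountDownLatch done = new CountDownLatch(workers);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                try {
                    // Placeholder for creating a session and firing rules.
                    completed.incrementAndGet();
                } finally {
                    // Count down even if the work throws, so await() cannot hang.
                    done.countDown();
                }
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        // Only at this point is it safe to close shared resources.
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndWait(4) + " workers finished");
    }
}
```

Counting down in a `finally` block is the important detail: a worker that fails must still release the waiting thread.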
Re: [rules-users] Parallelization
On Tue, May 11, 2010 at 2:55 PM, djb wrote:
> Hi Wolfgang,
>
> I use one thread per StatefulKnowledgeSession... My machine has 2 cores, but it will eventually be running on an 8 core beast, so I reckon this was a good improvement. I was just worried that I wouldn't be able to simultaneously process multiple K-Sessions, but apparently, Drools doesn't mind. I'm pretty sure any machine with multiple cores supports parallel java threads, no?

This is a question for a Java guru (which I'm not), and may depend on the JVM and whatnot. Care to provide details? Probably there are people on this list who might know. (Otherwise, I have a good contact.)

> Regarding my Utilities method, e.g. isWithinTimePeriod("20100308", "20090405", 1, "Y")
>
> I can get about 5ms off by commenting out the eval, so it's not going to be a big jump even if I fix it, but, well, I am using yyyyMMdd Strings, which in the method, I sub-stringed, converted to ints, instantiated DateMidnight objects, and compared using Joda-time daysBetween/monthsBetween/yearsBetween methods.
>
> My thought was that pre-converting to ints would help, so that each ClaimLine has year/month/day int variables, and pass them in instead (i.e., saves 3 String.substring()'s and 3 Integer.parseInt()), but that actually slowed it down a few milliseconds. (Maybe passing 6 params instead of 2?!)

You can treat 20100510 as an integer defining one date; all ordering relations between such numbers hold just as for the "true" dates. You just can't compute durations (days between dates) by simple subtraction.

> I'm comparing two dates by an arbitrary period, like "2 days" or "1 month", and need the framework of the Gregorian Calendar. So, I don't think I can do anything about this. 2 months is never guaranteed to be a set number of milliseconds. It all depends on the claim date, which is fact data, and therefore variable.

I'm not sure how the other arguments in the call ("Y", 99) are related to rules or facts, but a repeated test of whether one date dx is between d1 and d2, where d2 depends on d1 and a duration, would certainly gain from computing d2 once.

-W
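The two suggestions in this reply (compare dates as yyyyMMdd integers, and compute the end date d2 once) can be combined as in the sketch below. Several things here are assumptions rather than anything stated in the thread: the class name `PeriodWindow`, the unit codes "D" and "M" (only "Y" appears in the example call), the inclusive window semantics, and the use of `java.time` in place of the Joda-Time classes the original method used.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: precomputes the window end once, then answers
// membership tests with plain integer comparisons.
class PeriodWindow {
    private final int start; // yyyyMMdd as an int, e.g. 20090405
    private final int end;   // yyyyMMdd as an int, computed once

    // unit: "D", "M" or "Y" (assumed codes; only "Y" is shown in the thread).
    PeriodWindow(int startYmd, int amount, String unit) {
        LocalDate d1 = LocalDate.parse(String.valueOf(startYmd),
                                       DateTimeFormatter.BASIC_ISO_DATE);
        LocalDate d2;
        switch (unit) {
            case "D": d2 = d1.plusDays(amount); break;
            case "M": d2 = d1.plusMonths(amount); break;
            case "Y": d2 = d1.plusYears(amount); break;
            default: throw new IllegalArgumentException("unknown unit: " + unit);
        }
        this.start = startYmd;
        this.end = toInt(d2);
    }

    // Encodes a date as yyyyMMdd so that integer order equals date order.
    static int toInt(LocalDate d) {
        return d.getYear() * 10000 + d.getMonthValue() * 100 + d.getDayOfMonth();
    }

    // Assumed semantics: inclusive on both ends.
    boolean contains(int ymd) {
        return ymd >= start && ymd <= end;
    }

    public static void main(String[] args) {
        PeriodWindow window = new PeriodWindow(20090405, 1, "Y");
        System.out.println(window.contains(20100308));
    }
}
```

The calendar arithmetic happens once per window in the constructor; each subsequent check is two integer comparisons, with no substring, parsing, or date-object allocation.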
Re: [rules-users] Parallelization
Hi Wolfgang,

Ok, well I implemented my "option #2", which has cut it down to 23ms, which is a good start. My timing is done by taking the time before and after, and dividing by the number of claims processed (and averaging over a few runs).

I use one thread per StatefulKnowledgeSession... My machine has 2 cores, but it will eventually be running on an 8 core beast, so I reckon this was a good improvement. I was just worried that I wouldn't be able to simultaneously process multiple K-Sessions, but apparently, Drools doesn't mind. I'm pretty sure any machine with multiple cores supports parallel java threads, no?

-

Regarding my Utilities method, e.g. isWithinTimePeriod("20100308", "20090405", 1, "Y")

I can get about 5ms off by commenting out the eval, so it's not going to be a big jump even if I fix it, but, well, I am using yyyyMMdd Strings, which in the method, I sub-stringed, converted to ints, instantiated DateMidnight objects, and compared using Joda-time daysBetween/monthsBetween/yearsBetween methods.

My thought was that pre-converting to ints would help, so that each ClaimLine has year/month/day int variables, and pass them in instead (i.e., saves 3 String.substring()'s and 3 Integer.parseInt()), but that actually slowed it down a few milliseconds. (Maybe passing 6 params instead of 2?!)

I'm comparing two dates by an arbitrary period, like "2 days" or "1 month", and need the framework of the Gregorian Calendar. So, I don't think I can do anything about this. 2 months is never guaranteed to be a set number of milliseconds. It all depends on the claim date, which is fact data, and therefore variable.

Regards,
Daniel

--
View this message in context: http://drools-java-rules-engine.46999.n3.nabble.com/Parallelization-tp809341p809753.html
Re: [rules-users] Parallelization
Does your system support parallel execution of Java threads on multiple processors? Otherwise I don't see how parallelization will gain much, since the Rete evaluation itself is clearly compile-time bound.

How are you timing these 53ms? Does this include input time for your facts? Frequently, much of a program's elapsed time is saved by adopting a better I/O strategy.

Also, IIRC, there was the Utilities method comparing times. Have you looked into possible speed gains there, i.e., what's the type of the "date" attribute and how does that U-method work?

-W
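For context on the RejectedExecutionException in Daniel's original post: in plain `java.util.concurrent`, a `ThreadPoolExecutor` with the default abort policy throws this exception when a task is handed to it after `shutdown()` has been called. Whether that is what happened inside Drools' PartitionTaskManager is not confirmed anywhere in this thread, although it would be consistent with Daniel's suspicion that the session was disposed before evaluation finished. A minimal sketch of the plain-Java behaviour, with a hypothetical class name:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo class; it does not involve Drools at all.
class RejectDemo {

    // Returns true if submitting to an already shut-down pool is rejected.
    static boolean rejectedAfterShutdown() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.shutdown(); // no new tasks are accepted from here on
        try {
            pool.execute(() -> { /* never runs */ });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + rejectedAfterShutdown());
    }
}
```

If the same pattern applies to a given setup, the usual remedy is to wait for outstanding work to finish before disposing of whatever owns the executor.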