RE: [rules-users] Drools 4 poor performance scaling?
Hi,

Could you please explain this technique in more detail? Let me restate what I have understood from the explanation:

1) You have around 700k - 900k facts in total.
2) You insert a single fact (the ColdStarting fact) first and retract it when all the Jobs and Workers have been loaded. How do you retract it? And does "loaded" mean that all the other facts have been inserted by then?
3) "This changed our startup time from over 50 minutes to under 5." Which time is being measured here?
4) "There's some sort of strange propagation and looping going on with accumulation on the fly, at least with our facts and rules." Could you please explain this further?

Please also post the URL of the wiki entry where we can find more information.

Thank you.

Fenderbosch, Eric wrote:
> FYI for the group. We seem to have solved our performance problem.
>
> I'll describe our problem space a bit so people have some context. We
> load up about 1200 Jobs with about 3000 Stops and about 1500 Vehicles
> with about 2000 Workers. We then calculate Scores for each Vehicle for
> each Job. Some combinations get excluded for various reasons, but we
> end up with 700k - 900k total facts. We do score totaling and sorting
> using accumulators.
>
> One of our team members (nice find Dan) decided to try to isolate the
> accumulation rules until all our other facts are loaded. Those rules
> now have a "not ColdStarting()" condition and our startup code inserts a
> ColdStarting fact as the first fact and retracts it when all the Jobs
> and Workers have been loaded. This changed our startup time from over
> 50 minutes to under 5. There's some sort of strange propagation and
> looping going on with accumulation on the fly, at least with our facts
> and rules.
>
> I'll put an entry on the wiki as well.
Re: [rules-users] Drools 4 poor performance scaling?
It's more related to the issues with accumulate. At the moment every single propagation executes both the action and the result, and propagates the result to the next pattern. When loading data, this means you are constantly evaluating the result and the rest of the rule for each propagation. In reality, what you want is to load the accumulate, then execute the result and propagate once at the end. However, finding a robust way of doing this isn't easy.

Mark

Anstis, Michael (M.) wrote:
> I wonder whether it's a "benefit" of truth maintenance?
>
> If a new fact is inserted into working memory that could cause an
> activation of a rule that contains an accumulate (or collect) to
> change, then is the whole accumulate (or collect) operator executed
> again?
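The cost Mark describes can be illustrated with a toy model in plain Java. This is only a sketch under stated assumptions, not how Drools implements accumulate: the class AccumulateCostSketch, its action/result methods, and the choice of a sort as the "result" step are hypothetical names picked for illustration. It compares evaluating the result after every insertion with evaluating it once after loading.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: it imitates the action/result split of an accumulate,
// but it is NOT how Drools implements it. All names are hypothetical.
final class AccumulateCostSketch {

    static final class SortedScores {
        private final List<Integer> scores = new ArrayList<>();
        long resultEvaluations = 0; // how many times result() ran
        long elementsSorted = 0;    // total elements handled by all result() calls

        // "action": fold one more fact into the running state.
        void action(int score) {
            scores.add(score);
        }

        // "result": derive the value that would be propagated downstream.
        List<Integer> result() {
            resultEvaluations++;
            elementsSorted += scores.size();
            List<Integer> sorted = new ArrayList<>(scores);
            Collections.sort(sorted);
            return sorted;
        }
    }

    // Result is evaluated after every insertion (accumulation "on the fly").
    static SortedScores loadEagerly(int factCount) {
        SortedScores acc = new SortedScores();
        for (int i = 0; i < factCount; i++) {
            acc.action(factCount - i);
            acc.result();
        }
        return acc;
    }

    // Result is evaluated once, after all facts have been loaded.
    static SortedScores loadDeferred(int factCount) {
        SortedScores acc = new SortedScores();
        for (int i = 0; i < factCount; i++) {
            acc.action(factCount - i);
        }
        acc.result();
        return acc;
    }

    public static void main(String[] args) {
        int factCount = 2000;
        SortedScores eager = loadEagerly(factCount);
        SortedScores deferred = loadDeferred(factCount);
        System.out.println("eager:    " + eager.resultEvaluations
                + " result evaluations, " + eager.elementsSorted + " elements sorted");
        System.out.println("deferred: " + deferred.resultEvaluations
                + " result evaluations, " + deferred.elementsSorted + " elements sorted");
    }
}
```

For n inserted facts the eager variant runs the result step n times and handles n(n+1)/2 elements in total, while the deferred variant runs it once. That is the kind of gap a guard such as "not ColdStarting()" is meant to avoid.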
RE: [rules-users] Drools 4 poor performance scaling?
I wonder whether it's a "benefit" of truth maintenance?

If a new fact is inserted into working memory that could cause an activation of a rule that contains an accumulate (or collect) to change, then is the whole accumulate (or collect) operator executed again?

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Fenderbosch, Eric
Sent: 09 July 2008 14:41
To: Rules Users List
Subject: RE: [rules-users] Drools 4 poor performance scaling?

FYI for the group. We seem to have solved our performance problem.

I'll describe our problem space a bit so people have some context. We load up about 1200 Jobs with about 3000 Stops and about 1500 Vehicles with about 2000 Workers. We then calculate Scores for each Vehicle for each Job. Some combinations get excluded for various reasons, but we end up with 700k - 900k total facts. We do score totaling and sorting using accumulators.

One of our team members (nice find Dan) decided to try to isolate the accumulation rules until all our other facts are loaded. Those rules now have a "not ColdStarting()" condition and our startup code inserts a ColdStarting fact as the first fact and retracts it when all the Jobs and Workers have been loaded. This changed our startup time from over 50 minutes to under 5. There's some sort of strange propagation and looping going on with accumulation on the fly, at least with our facts and rules.

I'll put an entry on the wiki as well.
RE: [rules-users] Drools 4 poor performance scaling?
FYI for the group. We seem to have solved our performance problem.

I'll describe our problem space a bit so people have some context. We load up about 1200 Jobs with about 3000 Stops and about 1500 Vehicles with about 2000 Workers. We then calculate Scores for each Vehicle for each Job. Some combinations get excluded for various reasons, but we end up with 700k - 900k total facts. We do score totaling and sorting using accumulators.

One of our team members (nice find Dan) decided to try to isolate the accumulation rules until all our other facts are loaded. Those rules now have a "not ColdStarting()" condition and our startup code inserts a ColdStarting fact as the first fact and retracts it when all the Jobs and Workers have been loaded. This changed our startup time from over 50 minutes to under 5. There's some sort of strange propagation and looping going on with accumulation on the fly, at least with our facts and rules.

I'll put an entry on the wiki as well.

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Fenderbosch, Eric
Sent: Monday, June 30, 2008 11:46 AM
To: Rules Users List
Subject: RE: [rules-users] Drools 4 poor performance scaling?

We are having a similar problem, although our fact count is much higher. Performance seems pretty good and consistent until about 400k facts, then performance degrades significantly. Part of the degradation is from bigger and more frequent GCs, but not all of it.

Time to load first 100k facts: ~1 min
Time to load next 100k facts: ~1 min
Time to load next 100k facts: ~2 min
Time to load next 100k facts: ~4 min

This trend continues; going from 600k to 700k facts takes over 7 minutes. We're running 4.0.7 on a 4 CPU box with 12 GB, 64 bit RH Linux and 64 bit JRockit 5. We've allocated a 9 GB heap for the VM using large pages, so no memory paging is happening. JRockit is started w/ the -XXaggressive parameter, which enables large pages and the more efficient hash function in HashMap which was introduced in Java5 update 8.

http://e-docs.bea.com/jrockit/jrdocs/refman/optionXX.html

The end state is over 700k facts, with the possibility of nearly 1M facts in production. After end state is reached and we issue a few GC requests, it looks like our memory per fact is almost 9k, which seems quite high as most of the facts are very simple. Could that be due to our liberal use of insertLogical and TMS?

We've tried performing a "commit" every few hundred fact insertions by issuing a fireAllRules periodically, and that seems to have helped marginally.

I tried disabling shadow proxies and a few of our ~390 test cases fail and one loops indefinitely. I'm pretty sure we could fix those, but don't want to bother if this isn't a realistic solution.

Any thoughts?

Thanks

Eric
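The startup sequence described in the message above (insert a ColdStarting fact first, load everything else, then retract it) might look roughly like the sketch below. It uses a hypothetical ToySession written for this example, not the Drools API, and the "rule" is just a method that returns early while the marker fact is present. In Drools 4 itself, insert on a working memory should return a FactHandle and retract should take that handle, which is the shape the toy session imitates.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the cold-start guard. ToySession is a hypothetical stand-in
// written for this example; it is NOT the Drools API.
final class ColdStartSketch {

    // Marker fact; the name comes from the message above.
    static final class ColdStarting {
    }

    static final class ToySession {
        private final List<Object> facts = new ArrayList<>();
        int totalsComputed = 0; // how often the guarded "rule" actually ran
        int lastTotal = 0;

        // Returns the inserted object itself as its handle.
        Object insert(Object fact) {
            facts.add(fact);
            onChange();
            return fact;
        }

        void retract(Object handle) {
            facts.remove(handle);
            onChange();
        }

        // Plays the role of a rule with "not ColdStarting()" in front of an
        // accumulate that totals all Integer facts.
        private void onChange() {
            for (Object fact : facts) {
                if (fact instanceof ColdStarting) {
                    return; // guard present: skip the accumulation
                }
            }
            int total = 0;
            for (Object fact : facts) {
                if (fact instanceof Integer) {
                    total += (Integer) fact;
                }
            }
            lastTotal = total;
            totalsComputed++;
        }
    }

    static ToySession load(int factCount, boolean useGuard) {
        ToySession session = new ToySession();
        // Insert the marker as the very first fact and keep its handle.
        Object guard = useGuard ? session.insert(new ColdStarting()) : null;
        for (int i = 1; i <= factCount; i++) {
            session.insert(Integer.valueOf(i));
        }
        // Retract the marker once loading is complete.
        if (guard != null) {
            session.retract(guard);
        }
        return session;
    }

    public static void main(String[] args) {
        ToySession unguarded = load(1000, false);
        ToySession guarded = load(1000, true);
        System.out.println("without guard: total computed " + unguarded.totalsComputed + " times");
        System.out.println("with guard:    total computed " + guarded.totalsComputed + " times");
    }
}
```

In this toy model both variants end with the same total, but the guarded load computes it only once, when the marker is retracted.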
RE: [rules-users] Drools 4 poor performance scaling?
A small update. I ran the example below starting with 90 rules and running down to 1. The execution time remains the same, so I don't think the problem has to do with something left over from the previous call to fireAllRules().

This really should be a simple example and I am still at a loss as to why it doesn't work quickly.

Again, any help appreciated!

Ron

>> -----Original Message-----
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of Ron Kneusel
>> Sent: Thursday, June 26, 2008 12:47 PM
>> To: rules-users@lists.jboss.org
>> Subject: [rules-users] Drools 4 poor performance scaling?
>>
>> I am testing Drools 4 for our application and while sequential mode is
>> very fast I get very poor scaling when I increase the number of facts
>> for stateful or stateless sessions. I want to make sure I'm not doing
>> something foolish before deciding on whether or not to use Drools
>> because from what I am reading online it should be fast with the
>> number of facts I have.
>>
>> The scenario: I have 1000 rules in a DRL file. They are all of the
>> form:
>>
>> rule rule
>>     when
>>         Data(type == 0, value > 0.185264);
>>         Data(type == 3, value < 0.198202);
>>     then
>>         insert(new AlarmRaised(0));
>>         warnings.setAlarm(0, true);
>> end
>>
>> where the ranges checked on the values and the types are randomly
>> generated. Then, I create a Stateful session and run in a loop timing
>> how long it takes the engine to fire all rules as the number of
>> inserted facts increases:
>>
>> // Run
>> for(j=0; j < 100; j+=5) {
>>
>>     if (j==0) {
>>         nfacts = 1;
>>     } else {
>>         nfacts = j;
>>     }
>>
>>     System.out.println(nfacts + ":");
>>
>>     // Get a working memory
>>     StatefulSession wm = ruleBase.newStatefulSession();
>>
>>     // Global - output
>>     warnings = new Alarm();
>>     wm.setGlobal("warnings", warnings);
>>
>>     // Add facts
>>     st = (new Date()).getTime();
>>     for(i=0; i < nfacts; i++) {
>>         wm.insert(new Data(rand.nextInt(4),
>>                 rand.nextDouble()-0.5));
>>     }
>>     en = (new Date()).getTime();
>>     System.out.println("facts = " + (en-st));
>>
>>     // Now run the rules
>>     st = (new Date()).getTime();
>>     wm.fireAllRules();
>>     en = (new Date()).getTime();
>>     System.out.println("rules = " + (en-st));
>>
>>     // Clean up
>>     wm.dispose();
>>
>>     System.out.println("\n");
>> }
>>
>> This code is based on the HelloWorldExample.java code from the manual
>> and the setup for the rule base is the same as in the manual. As the
>> number of facts increases runtime increases dramatically:
>>
>> facts -- runtime (ms)
>> 10 -- 168
>> 20 -- 166
>> 30 -- 344
>> 40 -- 587
>> 50 -- 1215
>> 60 -- 1931
>> 70 -- 2262
>> 80 -- 3000
>> 90 -- 4754
>>
>> with a maximum memory use of about 428 MB RAM. By contrast, if I use
>> sequential stateless sessions, everything runs in about 1-5 ms.
>>
>> Is there something in my set up that would cause this, or is this how
>> one would expect Drools to scale? I read about people using thousands
>> of facts so I suspect I'm setting something up incorrectly.
>>
>> Any help appreciated!
>>
>> Ron

_______________________________________________
rules-users mailing list
rules-users@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users
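One plausible contributor to the growth in the runtime table above is that each rule joins two Data patterns, so a rule can match once for every qualifying pair of facts rather than once per fact. The sketch below only counts such pairs in plain Java. It does not run Drools, and the class and method names (JoinGrowthSketch, pairMatches, totalMatches) are hypothetical. Fact and rule generation follows the description in the message: types 0 to 3 and values drawn as rand.nextDouble() - 0.5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Counts how many fact pairs satisfy a two-pattern rule of the form
//   Data(type == A, value > lower); Data(type == B, value < upper)
// Plain-Java illustration only; it does not use or model the Drools engine.
final class JoinGrowthSketch {

    static final class Data {
        final int type;
        final double value;

        Data(int type, double value) {
            this.type = type;
            this.value = value;
        }
    }

    // Every left match is paired with every right match (a cross product).
    static long pairMatches(List<Data> facts, int leftType, double lowerBound,
                            int rightType, double upperBound) {
        long pairs = 0;
        for (Data left : facts) {
            if (left.type != leftType || !(left.value > lowerBound)) {
                continue;
            }
            for (Data right : facts) {
                if (right.type == rightType && right.value < upperBound) {
                    pairs++;
                }
            }
        }
        return pairs;
    }

    // Total pair matches over randomly generated facts and rules.
    static long totalMatches(int factCount, int ruleCount, long seed) {
        Random rand = new Random(seed);
        List<Data> facts = new ArrayList<>();
        for (int i = 0; i < factCount; i++) {
            facts.add(new Data(rand.nextInt(4), rand.nextDouble() - 0.5));
        }
        long total = 0;
        for (int r = 0; r < ruleCount; r++) {
            total += pairMatches(facts,
                    rand.nextInt(4), rand.nextDouble() - 0.5,
                    rand.nextInt(4), rand.nextDouble() - 0.5);
        }
        return total;
    }

    public static void main(String[] args) {
        for (int factCount = 10; factCount <= 90; factCount += 20) {
            System.out.println(factCount + " facts -> "
                    + totalMatches(factCount, 1000, 42L) + " pair matches over 1000 rules");
        }
    }
}
```

If the pair count grows roughly with the square of the fact count, the number of rule firings (and of AlarmRaised facts inserted by the right-hand sides) would grow the same way. Whether that accounts for the measured times would need to be checked against the real engine.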
RE: [rules-users] Drools 4 poor performance scaling?
I'm in IRC now.

The non-business sensitive test case hasn't been maintained. At this stage it might be pretty difficult to create one that doesn't have proprietary information and still functions anywhere near the same. We've got nearly 200 rules and 20 different kinds of facts. I wonder if a simple obfuscation would be sufficient?

I did give 5.0M1 a try last week. Several of our rules wouldn't compile. I tried for a day or so to fix things, but then gave up. We know it is non-optimal, but we have a few rules with "if" statements in the RHS and those simply wouldn't compile in 5.0. I'd like to refactor those out to at least an "eval" in the LHS, but ideally I'd like to precompute the statement and store the result in a new fact so that it could be indexed.

Is 5.0 better for multi-threaded access as we discussed before? We've had to wrap all access to working memory in synchronized blocks when using 4.x. That's a pretty big hammer, but it works. Otherwise fact insertions/retracts, firing of rules and queries end up getting run at the same time by different threads and working memory ends up completely unusable.

Maybe I'll take another stab at fixing those rules and give 5.0 another go. Any target on a 5.0 release date? We're looking to go live in production in about 1 month.

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mark Proctor
Sent: Monday, June 30, 2008 12:39 PM
To: Rules Users List
Subject: Re: [rules-users] Drools 4 poor performance scaling?

Fenderbosch, Eric wrote:
> We are having a similar problem, although our fact count is much higher.
> Performance seems pretty good and consistent until about 400k facts,
> then performance degrades significantly. Part of the degradation is
> from bigger and more frequent GCs, but not all of it.

If you have multiple CPUs, there is a JVM option that lets you dedicate a CPU to GC; that helps somewhat.

> Time to load first 100k facts: ~1 min
> Time to load next 100k facts: ~1 min
> Time to load next 100k facts: ~2 min
> Time to load next 100k facts: ~4 min
>
> This trend continues; going from 600k to 700k facts takes over 7
> minutes. We're running 4.0.7 on a 4 CPU box with 12 GB, 64 bit RH
> Linux and 64 bit JRockit 5. We've allocated a 9 GB heap for the VM
> using large pages, so no memory paging is happening. JRockit is
> started w/ the -XXaggressive parameter, which enables large pages and
> the more efficient hash function in HashMap which was introduced in
> Java5 update 8.

Other than the CPU thing, Drools won't take advantage of multiple CPUs at the moment.

> http://e-docs.bea.com/jrockit/jrdocs/refman/optionXX.html
>
> The end state is over 700k facts, with the possibility of nearly 1M
> facts in production. After end state is reached and we issue a few GC
> requests, it looks like our memory per fact is almost 9k, which seems
> quite high as most of the facts are very simple. Could that be due to
> our liberal use of insertLogical and TMS?

It could be related to this, especially if you create a long chain of logical relationships.

> We've tried performing a "commit" every few hundred fact insertions by
> issuing a fireAllRules periodically, and that seems to have helped
> marginally.
>
> I tried disabling shadow proxies and a few of our ~390 test cases fail
> and one loops indefinitely. I'm pretty sure we could fix those, but
> don't want to bother if this isn't a realistic solution.
>
> Any thoughts?

Have you tried this on Drools 5.0? It doesn't need shadow proxies and implements a new Rete algorithm that is faster for retracts. You can get a nightly build from here; I'd be interested to find out how broken 5.0 is :)

https://hudson.jboss.org/hudson/job/drools/lastSuccessfulBuild/artifact/trunk/target/

We still have more performance work to do. The items are known and it is just a matter of time, although not all of them will make 5.0. The main items include:

1) Bytecode compiled Rete network, instead of interpreted nodes. I'm hoping this will have a large impact, reducing GC and general indirection and recursive method call frames.
2) "True modify", instead of a retract+assert. This will also remove the need for the activation normalisation that we do for TMS and the agenda event model.
3) Range indexing (initially literals, but would like to explore variables too).

Steve, before he left FedEx, was creating a simulator for this use case, but removing anything business sensitive, so that we could use it publicly as a benchmark and to help us tune the engine. Are you still working on this? Steve used to chat to us on IRC; can I ask you to pop on for a chat?

http://labs.jboss.org/drools/irc.html

mark

> Thanks
>
> Eric
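The "wrap all access to working memory in synchronized blocks" approach mentioned in the message above can be sketched as a thin wrapper that funnels every call through one lock. ToyMemory and GuardedMemory are hypothetical stand-ins written for this example, not Drools classes. With a real session the same idea would presumably cover inserts, retracts, rule firing and queries alike.

```java
import java.util.ArrayList;
import java.util.List;

// Coarse-grained locking around a non-thread-safe object. ToyMemory and
// GuardedMemory are hypothetical stand-ins, not Drools classes.
final class GuardedSessionSketch {

    // Not thread-safe on its own: ArrayList is unsynchronized.
    static final class ToyMemory {
        private final List<Object> facts = new ArrayList<>();

        void insert(Object fact) {
            facts.add(fact);
        }

        int size() {
            return facts.size();
        }
    }

    // Every operation takes the same lock, so calls never interleave.
    static final class GuardedMemory {
        private final Object lock = new Object();
        private final ToyMemory delegate = new ToyMemory();

        void insert(Object fact) {
            synchronized (lock) {
                delegate.insert(fact);
            }
        }

        int size() {
            synchronized (lock) {
                return delegate.size();
            }
        }
    }

    // Inserts from several threads and returns the final fact count.
    static int insertConcurrently(int threadCount, int insertsPerThread) {
        GuardedMemory memory = new GuardedMemory();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < insertsPerThread; i++) {
                    memory.insert(Integer.valueOf(i));
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return memory.size();
    }

    public static void main(String[] args) {
        System.out.println("facts after concurrent load: " + insertConcurrently(4, 10000));
    }
}
```

A single lock is the "big hammer" described in the message: it is simple to reason about, at the price of serialising all access to the session.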
RE: [rules-users] Drools 4 poor performance scaling?
Eric-

Mark has my test code and said he'll look at it in a few days. I'm sure he'll reply here but if not, I'll post it.

My question for you is how quickly does fireAllRules() run for you? As I said, with only 95 facts loaded it is taking my machine (dual core 2 GHz box, Fedora Core 6) almost 5 seconds.

Ron

> Subject: RE: [rules-users] Drools 4 poor performance scaling?
> Date: Mon, 30 Jun 2008 11:46:15 -0400
> From: [EMAIL PROTECTED]
> To: rules-users@lists.jboss.org
>
> We are having a similar problem, although our fact count is much higher.
> Performance seems pretty good and consistent until about 400k facts,
> then performance degrades significantly. Part of the degradation is
> from bigger and more frequent GCs, but not all of it.
>
> Time to load first 100k facts: ~1 min
> Time to load next 100k facts: ~1 min
> Time to load next 100k facts: ~2 min
> Time to load next 100k facts: ~4 min
>
> This trend continues; going from 600k to 700k facts takes over 7
> minutes. We're running 4.0.7 on a 4 CPU box with 12 GB, 64 bit RH Linux
> and 64 bit JRockit 5. We've allocated a 9 GB heap for the VM using
> large pages, so no memory paging is happening. JRockit is started w/
> the -XXaggressive parameter, which enables large pages and the more
> efficient hash function in HashMap which was introduced in Java5 update
> 8.
>
> http://e-docs.bea.com/jrockit/jrdocs/refman/optionXX.html
>
> The end state is over 700k facts, with the possibility of nearly 1M
> facts in production. After end state is reached and we issue a few GC
> requests, it looks like our memory per fact is almost 9k, which seems
> quite high as most of the facts are very simple. Could that be due to
> our liberal use of insertLogical and TMS?
>
> We've tried performing a "commit" every few hundred fact insertions by
> issuing a fireAllRules periodically, and that seems to have helped
> marginally.
>
> I tried disabling shadow proxies and a few of our ~390 test cases fail
> and one loops indefinitely. I'm pretty sure we could fix those, but
> don't want to bother if this isn't a realistic solution.
>
> Any thoughts?
>
> Thanks
>
> Eric
Re: [rules-users] Drools 4 poor performance scaling?
Fenderbosch, Eric wrote:
> We are having a similar problem, although our fact count is much higher.
> Performance seems pretty good and consistent until about 400k facts, then
> performance degrades significantly. Part of the degradation is from bigger
> and more frequent GCs, but not all of it.

If you have multiple CPUs, there is a JVM option that lets you dedicate a CPU to GC; that helps somewhat.

> Time to load first 100k facts: ~1 min
> Time to load next 100k facts: ~1 min
> Time to load next 100k facts: ~2 min
> Time to load next 100k facts: ~4 min
>
> This trend continues: going from 600k to 700k facts takes over 7 minutes.
>
> We're running 4.0.7 on a 4 CPU box with 12 GB, 64 bit RH Linux and 64 bit
> JRockit 5. We've allocated a 9 GB heap for the VM using large pages, so no
> memory paging is happening. JRockit is started w/ the -XXaggressive
> parameter, which enables large pages and the more efficient hash function
> in HashMap which was introduced in Java5 update 8.

Other than the CPU thing, Drools won't take advantage of multiple CPUs at the moment.

> http://e-docs.bea.com/jrockit/jrdocs/refman/optionXX.html
>
> The end state is over 700k facts, with the possibility of nearly 1M facts
> in production. After end state is reached and we issue a few GC requests,
> it looks like our memory per fact is almost 9k, which seems quite high as
> most of the facts are very simple. Could that be due to our liberal use of
> insertLogical and TMS?

It could be related to this, especially if you create a long chain of logical relationships.

> We've tried performing a "commit" every few hundred fact insertions by
> issuing a fireAllRules periodically, and that seems to have helped
> marginally.
>
> I tried disabling shadow proxies and a few of our ~390 test cases fail and
> one loops indefinitely. I'm pretty sure we could fix those, but don't want
> to bother if this isn't a realistic solution.
>
> Any thoughts?

Have you tried this on Drools 5.0? It doesn't need shadow proxies and implements a new Rete algorithm that is faster for retracts. You can get a nightly build from here; I'd be interested to find out how broken 5.0 is :)
https://hudson.jboss.org/hudson/job/drools/lastSuccessfulBuild/artifact/trunk/target/

We still have more performance work to do. The items are known and it is just a matter of time, though not all will make 5.0. The main items include:
1) bytecode compiled Rete network, instead of interpreted nodes. I'm hoping this will have a large impact, reducing GC and general indirection and recursive method call frames.
2) "true modify", instead of a retract+assert, which will also remove the need for the activation normalisation that we do for TMS and the agenda event model.
3) range indexing (initially literals, but would like to explore variables too).

Steve, before he left FedEx, was creating a simulator for this use case, with anything business sensitive removed, so that we could use it publicly as a benchmark and to help us tune the engine. Are you still working on this? Steve used to chat to us on IRC; can I ask you to pop on for a chat?
http://labs.jboss.org/drools/irc.html

mark

> Thanks
> Eric
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ron Kneusel
> Sent: Thursday, June 26, 2008 12:47 PM
> To: rules-users@lists.jboss.org
> Subject: [rules-users] Drools 4 poor performance scaling?
>
> I am testing Drools 4 for our application and while sequential mode is very
> fast I get very poor scaling when I increase the number of facts for
> stateful or stateless sessions. I want to make sure I'm not doing something
> foolish before deciding on whether or not to use Drools because from what I
> am reading online it should be fast with the number of facts I have.
>
> The scenario: I have 1000 rules in a DRL file. They are all of the form:
>
> rule rule
> when
>     Data(type == 0, value > 0.185264);
>     Data(type == 3, value < 0.198202);
> then
>     insert(new AlarmRaised(0));
>     warnings.setAlarm(0, true);
> end
>
> where the ranges checked on the values and the types are randomly
> generated. Then, I create a Stateful session and run in a loop timing how
> long it takes the engine to fire all rules as the number of inserted facts
> increases:
>
> // Run
> for(j=0; j < 100; j+=5) {
>
>     if (j==0) {
>         nfacts = 1;
>     } else {
>         nfacts = j;
>     }
>
>     System.out.println(nfacts + ":");
>
>     // Get a working memory
>     StatefulSession wm = ruleBase.newStatefulSession();
>
>     // Global - output
>     warnings = new Alarm();
>     wm.setGlobal("warnings", warnings);
>
>     // Add facts
>     st = (new Date()).getTime();
>     for(i=0; i < nfacts; i++) {
>         wm.insert(new Data(rand.nextInt(4), rand.nextDouble()-0.5));
>     }
>     en = (new Date()).getTime();
>     System.out.println("facts = " + (en-st));
>
>     // Now run the rul
RE: [rules-users] Drools 4 poor performance scaling?
We are having a similar problem, although our fact count is much higher. Performance seems pretty good and consistent until about 400k facts, then performance degrades significantly. Part of the degradation is from bigger and more frequent GCs, but not all of it.

Time to load first 100k facts: ~1 min
Time to load next 100k facts: ~1 min
Time to load next 100k facts: ~2 min
Time to load next 100k facts: ~4 min

This trend continues: going from 600k to 700k facts takes over 7 minutes.

We're running 4.0.7 on a 4 CPU box with 12 GB, 64 bit RH Linux and 64 bit JRockit 5. We've allocated a 9 GB heap for the VM using large pages, so no memory paging is happening. JRockit is started w/ the -XXaggressive parameter, which enables large pages and the more efficient hash function in HashMap which was introduced in Java5 update 8.
http://e-docs.bea.com/jrockit/jrdocs/refman/optionXX.html

The end state is over 700k facts, with the possibility of nearly 1M facts in production. After end state is reached and we issue a few GC requests, it looks like our memory per fact is almost 9k, which seems quite high as most of the facts are very simple. Could that be due to our liberal use of insertLogical and TMS?

We've tried performing a "commit" every few hundred fact insertions by issuing a fireAllRules periodically, and that seems to have helped marginally.

I tried disabling shadow proxies and a few of our ~390 test cases fail and one loops indefinitely. I'm pretty sure we could fix those, but don't want to bother if this isn't a realistic solution.

Any thoughts?

Thanks
Eric

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ron Kneusel
Sent: Thursday, June 26, 2008 12:47 PM
To: rules-users@lists.jboss.org
Subject: [rules-users] Drools 4 poor performance scaling?

I am testing Drools 4 for our application and while sequential mode is very fast I get very poor scaling when I increase the number of facts for stateful or stateless sessions. I want to make sure I'm not doing something foolish before deciding on whether or not to use Drools because from what I am reading online it should be fast with the number of facts I have.

The scenario: I have 1000 rules in a DRL file. They are all of the form:

    rule rule
    when
        Data(type == 0, value > 0.185264);
        Data(type == 3, value < 0.198202);
    then
        insert(new AlarmRaised(0));
        warnings.setAlarm(0, true);
    end

where the ranges checked on the values and the types are randomly generated. Then, I create a Stateful session and run in a loop timing how long it takes the engine to fire all rules as the number of inserted facts increases:

    // Run
    for(j=0; j < 100; j+=5) {

        if (j==0) {
            nfacts = 1;
        } else {
            nfacts = j;
        }

        System.out.println(nfacts + ":");

        // Get a working memory
        StatefulSession wm = ruleBase.newStatefulSession();

        // Global - output
        warnings = new Alarm();
        wm.setGlobal("warnings", warnings);

        // Add facts
        st = (new Date()).getTime();
        for(i=0; i < nfacts; i++) {
            wm.insert(new Data(rand.nextInt(4), rand.nextDouble()-0.5));
        }
        en = (new Date()).getTime();
        System.out.println("facts = " + (en-st));

        // Now run the rules
        st = (new Date()).getTime();
        wm.fireAllRules();
        en = (new Date()).getTime();
        System.out.println("rules = " + (en-st));

        // Clean up
        wm.dispose();

        System.out.println("\n");
    }

This code is based on the HelloWorldExample.java code from the manual and the setup for the rule base is the same as in the manual.

As the number of facts increases runtime increases dramatically:

    facts -- runtime (ms)
    10 -- 168
    20 -- 166
    30 -- 344
    40 -- 587
    50 -- 1215
    60 -- 1931
    70 -- 2262
    80 -- 3000
    90 -- 4754

with a maximum memory use of about 428 MB RAM. By contrast, if I use sequential stateless sessions, everything runs in about 1-5 ms.

Is there something in my set up that would cause this, or is this how one would expect Drools to scale? I read about people using thousands of facts so I suspect I'm setting something up incorrectly.

Any help appreciated!
Ron
___
rules-users mailing list
rules-users@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users
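As a rough check on the memory figures in Eric's message, the sketch below multiplies the reported per-fact footprint by the fact counts he mentions. It assumes that "almost 9k" means about 9 KiB per fact and that the 9 GB heap is 9 GiB; both readings are assumptions, and the class and method names are made up for illustration.

```java
// Back-of-the-envelope heap estimate from the figures quoted in the message.
// Assumption: "almost 9k" per fact is read as 9 KiB; the 9 GB heap as 9 GiB.
final class HeapEstimate {
    static final long BYTES_PER_FACT = 9L * 1024;
    static final long HEAP_BYTES = 9L * 1024 * 1024 * 1024;

    /** Estimated bytes retained by the given number of facts. */
    static long estimateBytes(long facts) {
        return facts * BYTES_PER_FACT;
    }

    /** Fraction of the heap that the facts alone would occupy. */
    static double heapFraction(long facts) {
        return (double) estimateBytes(facts) / HEAP_BYTES;
    }

    public static void main(String[] args) {
        long[] factCounts = {400_000, 700_000, 1_000_000};
        for (long facts : factCounts) {
            double gib = estimateBytes(facts) / (1024.0 * 1024 * 1024);
            System.out.println(facts + " facts -> " + gib + " GiB, heap fraction "
                    + heapFraction(facts));
        }
    }
}
```

Under these assumptions, 700k facts would take about two thirds of the heap and 1M facts about 95% of it, which would fit the observation that GCs get bigger and more frequent as the load progresses.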
Re: [rules-users] Drools 4 poor performance scaling?
Ron Kneusel wrote:
> Mark-
>
> I sent the zip file from a separate account because of Hotmail
> limitations. Please let me know if you got it and thanks for taking a look.

Yes, I got it, thanks. It will take me a few days to get round to looking into it though; I'll try to get back to you by Tue/Wed.

> Ron
>
>> Date: Thu, 26 Jun 2008 18:37:26 +0100
>> From: [EMAIL PROTECTED]
>> To: rules-users@lists.jboss.org
>> Subject: Re: [rules-users] Drools 4 poor performance scaling?
>>
>> Ron,
>>
>> Can you send me your project in zip, I'd be interested to see this.
>>
>> Mark
RE: [rules-users] Drools 4 poor performance scaling?
Mark-

I sent the zip file from a separate account because of Hotmail limitations. Please let me know if you got it and thanks for taking a look.

Ron

> Date: Thu, 26 Jun 2008 18:37:26 +0100
> From: [EMAIL PROTECTED]
> To: rules-users@lists.jboss.org
> Subject: Re: [rules-users] Drools 4 poor performance scaling?
>
> Ron,
>
> Can you send me your project in zip, I'd be interested to see this.
>
> Mark
Re: [rules-users] Drools 4 poor performance scaling?
Ron,

Can you send me your project in zip, I'd be interested to see this.

Mark

Ron Kneusel wrote:
> I am testing Drools 4 for our application and while sequential mode is very
> fast I get very poor scaling when I increase the number of facts for
> stateful or stateless sessions. I want to make sure I'm not doing something
> foolish before deciding on whether or not to use Drools because from what I
> am reading online it should be fast with the number of facts I have.
>
> The scenario: I have 1000 rules in a DRL file. They are all of the form:
>
> rule rule
> when
>     Data(type == 0, value > 0.185264);
>     Data(type == 3, value < 0.198202);
> then
>     insert(new AlarmRaised(0));
>     warnings.setAlarm(0, true);
> end
>
> where the ranges checked on the values and the types are randomly
> generated. Then, I create a Stateful session and run in a loop timing how
> long it takes the engine to fire all rules as the number of inserted facts
> increases:
>
> // Run
> for(j=0; j < 100; j+=5) {
>
>     if (j==0) {
>         nfacts = 1;
>     } else {
>         nfacts = j;
>     }
>
>     System.out.println(nfacts + ":");
>
>     // Get a working memory
>     StatefulSession wm = ruleBase.newStatefulSession();
>
>     // Global - output
>     warnings = new Alarm();
>     wm.setGlobal("warnings", warnings);
>
>     // Add facts
>     st = (new Date()).getTime();
>     for(i=0; i < nfacts; i++) {
>         wm.insert(new Data(rand.nextInt(4), rand.nextDouble()-0.5));
>     }
>     en = (new Date()).getTime();
>     System.out.println("facts = " + (en-st));
>
>     // Now run the rules
>     st = (new Date()).getTime();
>     wm.fireAllRules();
>     en = (new Date()).getTime();
>     System.out.println("rules = " + (en-st));
>
>     // Clean up
>     wm.dispose();
>
>     System.out.println("\n");
> }
>
> This code is based on the HelloWorldExample.java code from the manual and
> the setup for the rule base is the same as in the manual.
>
> As the number of facts increases runtime increases dramatically:
>
> facts -- runtime (ms)
> 10 -- 168
> 20 -- 166
> 30 -- 344
> 40 -- 587
> 50 -- 1215
> 60 -- 1931
> 70 -- 2262
> 80 -- 3000
> 90 -- 4754
>
> with a maximum memory use of about 428 MB RAM. By contrast, if I use
> sequential stateless sessions, everything runs in about 1-5 ms.
>
> Is there something in my set up that would cause this, or is this how one
> would expect Drools to scale? I read about people using thousands of facts
> so I suspect I'm setting something up incorrectly.
>
> Any help appreciated!
>
> Ron
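One thing worth noting about the rule shape in Ron's benchmark: the two Data patterns share no constraint, so every fact matching the first pattern pairs with every fact matching the second, and each pair is a separate activation. The plain-Java sketch below counts those pairs without using Drools at all. The Data and Rule classes are hypothetical stand-ins, and the random rule generator is only a guess at how the 1000 rules were produced. It also ignores the AlarmRaised facts inserted by each firing and any engine overhead.

```java
import java.util.Random;

// Sketch: count the activations that rules of the posted two-pattern shape
// would produce. All class and method names here are hypothetical.
final class ActivationCount {

    /** Stand-in for the Data fact: type in 0..3, value in [-0.5, 0.5). */
    static final class Data {
        final int type;
        final double value;
        Data(int type, double value) { this.type = type; this.value = value; }
    }

    /** Rule shape: Data(type == t1, value > lo); Data(type == t2, value < hi). */
    static final class Rule {
        final int t1; final double lo;
        final int t2; final double hi;
        Rule(int t1, double lo, int t2, double hi) {
            this.t1 = t1; this.lo = lo; this.t2 = t2; this.hi = hi;
        }
    }

    /** With no join constraint, each (first match, second match) pair is one activation. */
    static long activations(Rule rule, Data[] facts) {
        long first = 0, second = 0;
        for (Data d : facts) {
            if (d.type == rule.t1 && d.value > rule.lo) first++;
            if (d.type == rule.t2 && d.value < rule.hi) second++;
        }
        return first * second;
    }

    /** Total activations for nRules random rules over nFacts random facts (assumed generator). */
    static long totalActivations(int nRules, int nFacts, long seed) {
        Random rand = new Random(seed);
        Data[] facts = new Data[nFacts];
        for (int i = 0; i < nFacts; i++) {
            facts[i] = new Data(rand.nextInt(4), rand.nextDouble() - 0.5);
        }
        long total = 0;
        for (int r = 0; r < nRules; r++) {
            Rule rule = new Rule(rand.nextInt(4), rand.nextDouble() - 0.5,
                                 rand.nextInt(4), rand.nextDouble() - 0.5);
            total += activations(rule, facts);
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 90; n += 10) {
            System.out.println(n + " facts -> " + totalActivations(1000, n, 42L) + " activations");
        }
    }
}
```

Because the count per rule is a product of two numbers that each grow with the number of facts, the total grows roughly with the square of the fact count. That may account for part of the faster-than-linear runtimes in the table, though it is not a substitute for profiling the actual project.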