Re: [rules-users] help with diagnosing CEP performance issue

2010-12-26 Thread radai
ah, events weren't being evicted because the RHS was only evaluated during
fire(), which was after a whole batch.
i have altered my test code according to your suggestion, to look like this:

public void testLargeBatches() {
    Node node1 = new Node("node 1");
    node1.setId(1L);
    s.insert(node1);

    clock.advanceTime(1L, TimeUnit.DAYS);

    int batchSize = 4;
    int daysToRun = 10;
    ArrayList<Backup> batch = new ArrayList<Backup>(batchSize);
    ArrayList<Long> nodeIds = new ArrayList<Long>();
    nodeIds.add(1L);
    DateTime engineTime;

    long start;
    long accumulated = 0;
    long currentTime;
    long eventTime;
    long delta;

    System.err.println("days: " + daysToRun + ". batch: " + batchSize
            + ". rules: " + ruleFile);

    for (int i = 0; i < daysToRun; i++) {
        engineTime = new DateTime(clock.getCurrentTime(), DateTimeZone.UTC);
        batch.clear();
        // generate random events for this day
        System.err.println("preparing batch " + i);
        BackupGenerationUtil.fillBackupBatch(batch, nodeIds,
                engineTime.minusHours(23).minusMinutes(59).minusSeconds(59),
                engineTime.minusSeconds(1), batchSize);

        System.err.println("feeding batch " + i);
        start = System.currentTimeMillis();

        currentTime = clock.getCurrentTime();

        // feed the batch in, advancing the pseudo-clock to each event's end time
        for (Backup b : batch) {
            eventTime = b.getFinished().getTime();
            delta = (eventTime - currentTime);
            clock.advanceTime(delta, TimeUnit.MILLISECONDS);
            currentTime = clock.getCurrentTime();
            backupStream.insert(b);
            s.fireAllRules();
        }

        // advance a day (what's left of it)
        delta = engineTime.plusDays(1).getMillis() - currentTime;
        clock.advanceTime(delta, TimeUnit.MILLISECONDS);
        s.fireAllRules();

        accumulated += (System.currentTimeMillis() - start);

        System.err.println("batch complete " + i);
    }

    long averageBatchProcessingTime = (long) (accumulated / (double) daysToRun);
    double perBackupEventTime = ((double) accumulated) / (batchSize * daysToRun);
    int eventsPerSecond = (int) (1000D / perBackupEventTime);

    System.err.println("average batch processing time is " + averageBatchProcessingTime
            + " which is " + perBackupEventTime + "/event or "
            + eventsPerSecond + " events/second");
}

the batch is now ordered by event end time (inside fillBackupBatch()), the
clock is advanced to the end of each event as it is processed, and rules are
fired after each insertion. this code will eventually run inside a j2ee
environment so i would rather not use fireUntilHalt() (also, using it
maxed out a single core - does it do busy-waiting internally?!). also, the
batch nature is by design: the system will process events in batches, not
truly one by one. in production there will be many such batches per day,
and they could be very sizable.
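the per-event clock handling described above boils down to this pattern; here is a standalone sketch of it (the PseudoClock class is just a stand-in for Drools' SessionPseudoClock so the snippet runs without the engine, and the timestamps are made up):

```java
// minimal stand-in for Drools' SessionPseudoClock, for illustration only
class PseudoClock {
    private long now = 0;
    long getCurrentTime() { return now; }
    void advanceTime(long millis) { now += millis; }
}

public class ClockPerEventDemo {
    public static void main(String[] args) {
        PseudoClock clock = new PseudoClock();
        long[] eventEndTimes = {1_000, 5_000, 9_000}; // batch, already ordered by end time
        long currentTime = clock.getCurrentTime();
        for (long eventTime : eventEndTimes) {
            long delta = eventTime - currentTime; // jump the clock to this event's end time
            clock.advanceTime(delta);
            currentTime = clock.getCurrentTime();
            // the real test inserts the event and calls fireAllRules() here,
            // so evictions happen as session time passes, not all at once
        }
        System.out.println(clock.getCurrentTime()); // prints 9000
    }
}
```

the point of the delta computation is that the clock only ever moves forward to the timestamp of the event about to be inserted, which is why the batch has to be sorted first.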

for 40K events/day over 10 days, the above code (on my machine) lands me at 59K
events/sec without the 2nd rule and 5300 events/sec with the second rule.
while this is an improvement of x2-x3 compared to my prev. code (24K without
the 2nd rule and 1400 with), for which i am very glad, i still cannot
understand why the 2nd rule slows me down by x10.

anything else i might do to get the performance of the 2 rules close to the
performance of the single rule?

2010/12/26 Wolfgang Laun wolfgang.l...@gmail.com

 You are inserting a day's worth of backup events at a time. In this loop
 the engine is forced to evaluate both rules for an ever-increasing number
 of events. It's the second rule that requires a comparison between any two
 backup events, and this causes a quadratic increase in processing time.

 CEP is short for continuous event processing, and that's what you should
 let the engine do, i.e., not only evaluate the conditions but also
 execute the consequences. This is done by calling fireAllRules() after each
 insertion. Alternatively, consider running fireUntilHalt() in one thread
 and inserting in another thread.

 Ideally, the test should generate events ordered by time, and the
 pseudo-clock might be advanced according to the time stamps in the generated
 events. This would give you a very close approximation to a production run.

 -W


 2010/12/26 radai radai.rosenbl...@gmail.com

 well, fixed now :-)

Re: [rules-users] help with diagnosing CEP performance issue

2010-12-25 Thread radai
well, fixed now :-)
the benchmark (part of a junit) looks like this:

protected void init() {
    KnowledgeBuilderConfiguration bConf =
            KnowledgeBuilderFactory.newKnowledgeBuilderConfiguration();
    KnowledgeBuilder builder = KnowledgeBuilderFactory.newKnowledgeBuilder(bConf);
    builder.add(...);

    KnowledgeBaseConfiguration kbConf =
            KnowledgeBaseFactory.newKnowledgeBaseConfiguration();
    kbConf.setOption( EventProcessingOption.STREAM );
    // todo - turn this on when https://issues.jboss.org/browse/JBRULES-2845 is fixed
    //kbConf.setOption( MultithreadEvaluationOption.YES );
    kb = KnowledgeBaseFactory.newKnowledgeBase(kbConf);
    kb.addKnowledgePackages(builder.getKnowledgePackages());

    KnowledgeSessionConfiguration sConf =
            KnowledgeBaseFactory.newKnowledgeSessionConfiguration();
    sConf.setOption(ClockTypeOption.get("pseudo"));
    Environment env = EnvironmentFactory.newEnvironment();
    s = kb.newStatefulKnowledgeSession(sConf, env);

    clock = (SessionPseudoClock) s.getSessionClock();

    backupStream = s.getWorkingMemoryEntryPoint("Backup Stream");
}

followed by a test method that looks like this:

public void testLargeBatches() {

    Node node1 = new Node("node 1");
    node1.setId(1L);
    s.insert(node1);
    s.fireAllRules();

    clock.advanceTime(1L, TimeUnit.DAYS);
    s.fireAllRules();

    int batchSize = 4;
    int daysToRun = 1;
    ArrayList<Backup> batch = new ArrayList<Backup>(batchSize);
    ArrayList<Long> nodeIds = new ArrayList<Long>();
    nodeIds.add(1L);
    DateTime engineTime;

    long start;
    long accumulated = 0;

    System.err.println("days: " + daysToRun + ". batch: " + batchSize
            + ". rules: " + ruleFile);

    for (int i = 0; i < daysToRun; i++) {
        engineTime = new DateTime(clock.getCurrentTime(), DateTimeZone.UTC);
        batch.clear();
        // generate random events for this day
        BackupGenerationUtil.fillBackupBatch(batch, nodeIds,
                engineTime.minusHours(23).minusMinutes(59).minusSeconds(59),
                engineTime.minusSeconds(1), batchSize);

        start = System.currentTimeMillis();

        // feed the batch in
        for (Backup b : batch) {
            backupStream.insert(b);
        }

        // advance a day and run all rules
        s.fireAllRules();
        clock.advanceTime(1L, TimeUnit.DAYS);
        s.fireAllRules();

        accumulated += (System.currentTimeMillis() - start);
    }

    long averageBatchProcessingTime = (long) (accumulated / (double) daysToRun);
    double perBackupEventTime = ((double) accumulated) / (batchSize * daysToRun);
    int eventsPerSecond = (int) (1000D / perBackupEventTime);

    System.err.println("average batch processing time is " + averageBatchProcessingTime
            + " which is " + perBackupEventTime + "/event or "
            + eventsPerSecond + " events/second");
}

running this with or without the 2nd trimming rule makes a huge
difference. on my machine i can run the non-trimming version in batches of
300K @ ~2500 events/second @ 1GB RAM. with the trimming rule i can't get
anywhere near that - the above value (40K) is already enough to slow it down
to ~1800 events/second, and i don't understand what's so heavy about my
attempt to save memory.

any help/thoughts/clues would be most welcome.


2010/12/23 Mauricio Salatino sala...@gmail.com

 hehe yes.. the same happens to me :)

 2010/12/23 Wolfgang Laun wolfgang.l...@gmail.com

 (OT: your nick "me" is confusing other gmail users. I see your mail as if I
 had sent it to myself. I was almost binning it.)

 The delay you see may depend on the way you run the test with 40K events.
 Please describe the sequence of insertions and calls to fire...(), the
 session setup (pseudo-clock, etc.)

 -W


 2010/12/23 me radai.rosenbl...@gmail.com

  Hi.

 im trying to demo drools-fusion for a system that processes backup
 events.
 i have the following 2 CEP rules:

 rule "Backup Not Succeeded For At Least 3 Days"
 @ruleId(1)
 when
     Node($id : id)
     not ( Backup(clientId == $id, $state : state == BackupStateEnum.FINISHED)
           over window:time( 3d ) from entry-point "Backup Stream" )
 then
     // nothing for now
 end

 rule "Prune Previous Successful Backups"
 @ruleId(2)
 when
     $prevBackup  : Backup($id : clientId, state == BackupStateEnum.FINISHED)
         from entry-point "Backup Stream"
     $newerBackup : Backup(clientId == $id, state == BackupStateEnum.FINISHED,
         this after $prevBackup) from entry-point "Backup Stream"
 then
     drools.retract($prevBackup);
 end

 rule #2 is my attempt to cut down on memory usage. the problem is that
 rule #2 slows down the processing very very considerably (2 orders of
 magnitude slower when simulating batches of 40K events/day).

 what would be my next step in trying to find the reason for it? my
 intuition tells me that with 2 events in the system at any given time
 instead of 120K (single node, 40K events/day 3 day window) i 

Re: [rules-users] help with diagnosing CEP performance issue

2010-12-25 Thread radai
just to be clear, the "fixed" was only referring to my nick on gmail ... ;-)

On Sun, Dec 26, 2010 at 6:24 AM, radai radai.rosenbl...@gmail.com wrote:

 well, fixed now :-)

Re: [rules-users] help with diagnosing CEP performance issue

2010-12-25 Thread Wolfgang Laun
You are inserting a day's worth of backup events at a time. In this loop the
engine is forced to evaluate both rules for an ever-increasing number of
events. It's the second rule that requires a comparison between any two
backup events, and this causes a quadratic increase in processing time.
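the quadratic cost is easy to see by counting the join evaluations rule #2 forces; a standalone sketch (plain Java, no Drools - the counts are pure combinatorics, not measured engine internals):

```java
public class JoinCostDemo {
    // pairwise comparisons when all n events of a day are still live at fire time
    // (insert the whole batch, then fire): every pair of events meets once
    static long batchComparisons(long n) {
        return n * (n - 1) / 2;
    }

    // comparisons when firing after each insert, with the pruning rule retracting
    // the older event: each new event only meets the single surviving previous one
    static long incrementalComparisons(long n) {
        return n - 1;
    }

    public static void main(String[] args) {
        long n = 40_000; // one day's batch, as in the test in this thread
        System.out.println("batch:       " + batchComparisons(n));       // 799980000
        System.out.println("incremental: " + incrementalComparisons(n)); // 39999
    }
}
```

at 40K events that is roughly 800 million pair evaluations per day versus about 40 thousand, which is the gap between the two firing strategies discussed here.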

CEP is short for continuous event processing, and that's what you should
let the engine do, i.e., not only evaluate the conditions but also execute
the consequences. This is done by calling fireAllRules() after each
insertion. Alternatively, consider running fireUntilHalt() in one thread
and inserting in another thread.
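the two-thread setup is a plain producer/consumer split; here is a minimal stdlib-only sketch of that shape (the consumer loop merely stands in for a session thread running fireUntilHalt(), and the HALT pill stands in for session.halt() - names and types are made up for illustration):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class FeedAndFireDemo {
    static final String HALT = "HALT"; // poison pill standing in for session.halt()

    public static int run(int events) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(64);
        AtomicInteger processed = new AtomicInteger();

        // consumer thread: stands in for the session thread running fireUntilHalt()
        Thread engine = new Thread(() -> {
            try {
                while (true) {
                    String e = queue.take();
                    if (e == HALT) return;
                    processed.incrementAndGet(); // real code: rules fire as events arrive
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        });
        engine.start();

        // producer: stands in for the thread calling entryPoint.insert(event)
        for (int i = 0; i < events; i++) {
            queue.put("event-" + i);
        }
        queue.put(HALT);
        engine.join();
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(1000)); // prints 1000
    }
}
```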

Ideally, the test should generate events ordered by time, and the
pseudo-clock might be advanced according to the time stamps in the generated
events. This would give you a very close approximation to a production run.

-W


2010/12/26 radai radai.rosenbl...@gmail.com

 well, fixed now :-)

Re: [rules-users] help with diagnosing CEP performance issue

2010-12-23 Thread Wolfgang Laun
(OT: your nick "me" is confusing other gmail users. I see your mail as if I
had sent it to myself. I was almost binning it.)

The delay you see may depend on the way you run the test with 40K events.
Please describe the sequence of insertions and calls to fire...(), the
session setup (pseudo-clock, etc.)

-W


2010/12/23 me radai.rosenbl...@gmail.com

 Hi.

 im trying to demo drools-fusion for a system that processes backup events.
 i have the following 2 CEP rules:

 rule "Backup Not Succeeded For At Least 3 Days"
 @ruleId(1)
 when
     Node($id : id)
     not ( Backup(clientId == $id, $state : state == BackupStateEnum.FINISHED)
           over window:time( 3d ) from entry-point "Backup Stream" )
 then
     // nothing for now
 end

 rule "Prune Previous Successful Backups"
 @ruleId(2)
 when
     $prevBackup  : Backup($id : clientId, state == BackupStateEnum.FINISHED)
         from entry-point "Backup Stream"
     $newerBackup : Backup(clientId == $id, state == BackupStateEnum.FINISHED,
         this after $prevBackup) from entry-point "Backup Stream"
 then
     drools.retract($prevBackup);
 end
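for reference, the Backup fact these rules and the test code operate on would look roughly like this - a hypothetical reconstruction, since the real class isn't shown in the thread (the event role and timestamp would be declared on top of it, e.g. via a `declare Backup @role( event ) @timestamp( finished ) end` block in the DRL):

```java
import java.util.Date;

// hypothetical reconstruction of the Backup event fact used in the rules above
public class Backup {
    private final long clientId;
    private final BackupStateEnum state;
    private final Date finished; // end time; the test advances the pseudo-clock to this

    public Backup(long clientId, BackupStateEnum state, Date finished) {
        this.clientId = clientId;
        this.state = state;
        this.finished = finished;
    }

    public long getClientId() { return clientId; }
    public BackupStateEnum getState() { return state; }
    public Date getFinished() { return finished; }
}

// guessed at from the rules; only FINISHED is actually referenced in the thread
enum BackupStateEnum { STARTED, FINISHED, FAILED }
```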

 rule #2 is my attempt to cut down on memory usage. the problem is that rule
 #2 slows down the processing very considerably (2 orders of magnitude
 slower when simulating batches of 40K events/day).

 what would be my next step in trying to find the reason for it? my
 intuition tells me that with 2 events in the system at any given time
 instead of 120K (single node, 40K events/day, 3 day window) i should have
 seen either a speedup or reduced memory consumption (if not both), and yet
 the results i'm seeing are the exact opposite.
 is there anything i'm missing? i could try adding some sort of
 "LastSuccessfulBackup" marker object into main memory and updating it using
 events from the backup stream, but then how would i express rule #1?
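the marker-object idea above can be sketched as a plain fact that is updated in place instead of letting events accumulate (a hypothetical POJO, not Drools API; how to express rule #1's 3-day window against it is exactly the open question):

```java
// hypothetical marker fact: one per client, updated in place rather than
// keeping every FINISHED Backup event in working memory
public class LastSuccessfulBackup {
    private final long clientId;
    private long finishedAt; // epoch millis of the latest successful backup

    public LastSuccessfulBackup(long clientId, long finishedAt) {
        this.clientId = clientId;
        this.finishedAt = finishedAt;
    }

    /** record a newer successful backup; older timestamps are ignored */
    public boolean update(long newFinishedAt) {
        if (newFinishedAt > finishedAt) {
            finishedAt = newFinishedAt;
            return true; // real code would call session.update() on this fact here
        }
        return false;
    }

    /** rule #1's condition would become: no success within the window */
    public boolean overdue(long now, long windowMillis) {
        return now - finishedAt > windowMillis;
    }

    public long getClientId() { return clientId; }
    public long getFinishedAt() { return finishedAt; }
}
```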

 any help/clues/suggestions would be greatly appreciated,

radai.

 ___
 rules-users mailing list
 rules-users@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/rules-users




Re: [rules-users] help with diagnosing CEP performance issue

2010-12-23 Thread Mauricio Salatino
hehe yes.. the same happens to me :)

2010/12/23 Wolfgang Laun wolfgang.l...@gmail.com





-- 
 - CTO @ http://www.plugtree.com
 - MyJourney @ http://salaboy.wordpress.com
 - Co-Founder @ http://www.jbug.com.ar

 - Salatino Salaboy Mauricio -