Hi Rinke,

First of all, I've prefixed my reply with [SCXML] which is the convention on the Apache Commons lists so we can easily group and identify specific component related messages.

I've added more specific comments inline below.

On 09-05-14 11:40, R.C. Hoekstra wrote:
Hi list,

As written before, we're a university team of scientists working on
multi-agent simulations of tropical diseases for a World Health Organization
project. A disease can be considered as a state machine, with the patient
going through various states and transitions, each triggering new events.

We've managed to make a working example of an XML file where a patient is
going through various stages of the disease, including treatments with
medicine. Our most important concern at the moment is: how efficient is it?
Our aim is a multi-agent simulation with possibly a few hundred thousand
instances of a state machine engine (SCXMLExecutor). I'd like to share the code setup
with you guys, and maybe you can give some clues on how efficient it will be
in terms of performance/speed, and maybe some hints if an alternative
approach would be better?

Maybe, but it's a bit difficult in the abstract without more concrete information on how you set up the project and how you use certain features.

If you have the code publicly viewable it will definitely be helpful if you can share that.

Also very important (but that should become clear if we can view the code) is which version of Commons SCXML you've set this up with. As you know, the current SCXML trunk is a major rewrite compared to the old and outdated 0.9 release. If you are still using the 0.9 release, it might be difficult (certainly for me) to provide concrete feedback and help, as I'm only focusing on the trunk 2.0 version.


General setup: we have a Population object (a wrapped list) containing all
agent objects. Each agent is assigned an SCXMLExecutor as the engine, so
there are many instances of SCXMLExecutor. We use the default JexlEvaluator,
and each SCXMLExecutor gets the agent it belongs to assigned to the
rootContext, so the agent's properties can be accessed from the scxml file.

Transitions: our transitions are usually of a special type: a patient usually
stays x days in a certain state, after which the transition takes place. The
x days is determined by drawing a random number from a statistical
distribution. There is usually more than one possible transition, each with
a different probability. So the scxml file must contain the following
information:
* distribution name and parameters to determine the time until the next
  transition.
* A number coupled to each possible transition indicating the likelihood
  that it happens.

With potentially 100K+ agents/SCXML instances running concurrently for x number of days, I can imagine memory becoming an issue. Or maybe not. Are you using (or intending to use) some level of SCXML state serialization/de-serialization to keep the memory footprint under control, or is everything expected to be kept running in memory? What are your environment (hardware) conditions/constraints?

We solved this in the following way:
* Distribution name, mean and variance parameters, and chances are defined in
  the datamodel as single variables: <data id="distr">. In each state's
  onentry we set these variables with the values specific for that state, via
  the assign tag. The chances variable is defined as an array:
  <assign name="chances" expr="[0.05d, 0.10d, 0.20d]" />
* The state's onentry also contains a send tag. Send passes the agent's id,
  the aforementioned variables and the event concerned. The send message is
  captured by our own implementation of EventDispatcher. This does two things:
  ** It draws the random time based on the passed distribution parameters and
     schedules this in our own discrete event manager. When the desired time
     has passed, the discrete event manager passes the correct event back to
     the correct SCXMLExecutor instance.
  ** It determines which transition will be chosen by drawing a random number
     on the basis of the chances array. This results in an index number of
     the transition to be chosen. This index number is passed as payload to
     the event. The scxml file checks this index number in the cond attribute
     of the transitions.
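
The index draw described in the last step can be sketched roughly as follows.
This is only an illustration in plain Java: the class name TransitionChooser
and the method chooseIndex are made up for this example, and normalising the
weights by their sum is an assumption, since the chances in the example above
add up to less than 1.

```java
import java.util.Random;

/** Hypothetical helper (not part of Commons SCXML): weighted index selection. */
final class TransitionChooser {

    private TransitionChooser() { }

    /**
     * Returns an index in [0, chances.length) with probability proportional
     * to chances[i]. The weights are normalised by their sum, which is an
     * assumption made for this sketch.
     */
    static int chooseIndex(double[] chances, Random rng) {
        double total = 0.0;
        for (double c : chances) {
            if (c < 0.0) {
                throw new IllegalArgumentException("negative weight: " + c);
            }
            total += c;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("weights must sum to more than 0");
        }
        // Draw a point in [0, total) and walk the cumulative sums until it is covered.
        double r = rng.nextDouble() * total;
        double cumulative = 0.0;
        for (int i = 0; i < chances.length; i++) {
            cumulative += chances[i];
            if (r < cumulative) {
                return i;
            }
        }
        // Only reachable through floating-point rounding of r up to total.
        return chances.length - 1;
    }
}
```

The returned index is what would be passed as the event payload and compared
in the cond attribute of each transition.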

Performance-wise, I don't think the SCXML engine itself is likely to become an issue, but your custom EventDispatcher/event-manager interaction might, certainly if these (all?) have to run on separate threads. You probably need, or have already implemented, some custom instance-to-event mapping solution for this?
Using 100K+ separate threads isn't likely to work ;)
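
To illustrate what I mean by avoiding a thread per instance: a single-threaded
scheduler with a time-ordered queue could look roughly like the sketch below.
It's plain Java and not part of Commons SCXML; the names EventQueue, Scheduled,
schedule and runUntil are hypothetical, and I'm assuming agents can be
identified by an int id.

```java
import java.util.PriorityQueue;
import java.util.function.BiConsumer;

/** Hypothetical single-threaded discrete event scheduler; all names are illustrative. */
final class EventQueue {

    /** One pending event: deliver eventName to the given agent at the given time. */
    static final class Scheduled implements Comparable<Scheduled> {
        final double time;
        final int agentId;
        final String eventName;

        Scheduled(double time, int agentId, String eventName) {
            this.time = time;
            this.agentId = agentId;
            this.eventName = eventName;
        }

        @Override
        public int compareTo(Scheduled other) {
            return Double.compare(time, other.time);
        }
    }

    private final PriorityQueue<Scheduled> pending = new PriorityQueue<>();
    private double now = 0.0;

    /** Schedules an event 'delay' time units (e.g. days) after the current simulation time. */
    void schedule(double delay, int agentId, String eventName) {
        if (delay < 0.0) {
            throw new IllegalArgumentException("delay must not be negative");
        }
        pending.add(new Scheduled(now + delay, agentId, eventName));
    }

    /**
     * Delivers all events due up to and including 'until', earliest first,
     * and returns how many were delivered. The callback may schedule new events.
     */
    int runUntil(double until, BiConsumer<Integer, String> deliver) {
        int delivered = 0;
        while (!pending.isEmpty() && pending.peek().time <= until) {
            Scheduled next = pending.poll();
            now = next.time;
            deliver.accept(next.agentId, next.eventName);
            delivered++;
        }
        now = Math.max(now, until);
        return delivered;
    }
}
```

In a real setup the deliver callback would look up the SCXMLExecutor belonging
to the agent id (for example from a map or an array) and trigger the event on
it, so one thread can drive all instances.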


Agent properties: each disease state also has its effect on the agent's
properties, for example the infectivity of the agent, or its fitness. The
agent was passed to the rootContext, so the onentry of each state contains
code to set the agent's properties specific to that state:
<script> agent.infectivity = 1 </script>

This is our overall approach. I'd be happy to receive any comments;
specifically tips regarding the expected speed/performance.

Persistence of your data would be the next thing I'd need more information about. Are you 'just' using SCXML serialization to write out and save your data and results, or do you have (a need for) some custom database storage solution? I expect you'll need to analyse the results, and doing that on 100K+ SCXML documents seems a bit verbose and highly inefficient to me :)

The current (trunk) SCXML datamodel handling and implementation definitely needs improvement, which is already on the roadmap. And in your use case, using (only) the datamodel to store the data without a separate/secondary backend storage might be cause for some concern.

Looking forward to more detailed information: your project is definitely very interesting, and at a scale at which I'd love to see Commons SCXML being used.

I can't really say if Commons SCXML *today* will be able to perform and scale well enough for this, but if not yet, I'm definitely willing to help improve and fix it if feasible.

Regards, Ate

best regards, Rinke

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org



