About XML parsing: a huge state machine could mean huge XML files, so you should consider using a native XML database.
There are some good free ones, such as eXist [1] and Xindice [2]. In my experience, eXist handles huge XML files better than Xindice. There are also many commercial ones, such as Tamino [3] from Software AG.

[1] http://exist.sourceforge.net/
[2] http://xml.apache.org/xindice/
[3] http://www.softwareag.com/corporate/products/tamino/default.asp

--- Rahul Akolkar <[EMAIL PROTECTED]> wrote:

> On 4/21/06, Wait, David L PWR <[EMAIL PROTECTED]> wrote:
> >
> > Thanks Rahul for all your responses to questions.
> >
> > I have been watching SCXML evolve at the W3C site and the scxml
> > commons project with much interest.
> >
> <snip/>
>
> Great, IMO, such overlap is useful at both ends.
>
> > Here's the use case we are interested in. We are exploring improving
> > our time-domain simulations of power generation and distribution
> > networks having very many interacting objects (at least thousands).
> > Imagine a simulation very similar to a scaled-up version of the
> > stop-watch example; maybe a dozen different kinds of "stop-watches",
> > each having thousands of instances interacting with each other in a
> > way that depends on the current status of their
> > finite-state-machines. Other kinds of classes would be designed to
> > solve a network of physics-based equations where the "knowns" depend
> > on the current statuses of states and the "unknowns" are other
> > properties evolving over time. Most events are triggered by the
> > objects themselves; others may be triggered through UIs.
> >
> <snap/>
>
> Quite interesting, I believe that the value of using a well-defined
> state chart notation like SCXML grows with the size and complexity of
> the system being modeled.
>
> > Given my limited description, do you foresee any issues to watch out
> > for in this kind of application?
> <snip/>
>
> You ask hard questions ;-)
>
> SHORT ANSWER:
>
> I haven't done any profiling at a scale even close to what you are
> talking about.
> You're probably aware that until about last week, the [SCXML]
> component in Jakarta Commons was considered to be a sandbox component
> (it was promoted out of sandbox earlier in the week). While efforts
> were of course made to write efficient library code, the primary focus
> up until now has been correctness, and will probably continue to be
> that way at least for a while. I suggest doing some experiments on a
> smaller scale so you can judge the scalability of the library for
> yourself. We would very much appreciate it if you reported back any
> inefficiencies you discover.
>
> LONG ANSWER (may contain obvious statements, sorry about that):
>
> Scalability is affected by many factors; efficiency of the underlying
> library is only one of them. When dealing with the orders of magnitude
> you mention above, some of the assumptions we have to make so we can
> focus on [SCXML] are:
>
> * You have hardware to match
> * You have middleware to match, and it is "configured for efficiency"
> * The application code is well-written
>
> Therefore, a suitable path to using the Commons SCXML implementation
> for your endeavor would probably be:
>
> (a) "Quickly" design a prototype system with only a few flavors (say a
> couple of smaller state machines, instead of a dozen larger ones) and
> fewer instances (say a hundred or two active state machine instances)
> (b) Employ good application authoring practices (such as creating an
> executor instance only when needed, and disposing of an instance once
> it runs to completion, etc.)
> (c) Simulate, test performance, profile if unsatisfactory, report
> findings here, submit patches to dev list etc.
> (d) Probably iterate (a) through (c) a few times
>
> This would actually be very helpful for the community, since this has
> to be done only so often, and benefits everyone.
>
> That brings us to what we do know about Commons SCXML today from a
> performance PoV.
> I have done some minimal profiling, and nothing has really stood out
> as alarming so far. I took this opportunity to post some CPU times on
> a 1.4.2 HotSpot JVM [1] for the standalone command line class
> StandaloneJexlExpressions [2] running the microwave-01 [3] sample
> (since you've probably seen this in the W3C WD) through a couple of
> "cook cycles".
>
> The results are pretty much as expected, IMO. Some commentary:
>
> * The
> org.apache.{crimson,xalan,commons.{beanutils,digester,scxml.io}}.*
> packages have to do with SCXML IO; parsing and serialization are
> expensive. However, the Commons SCXML model is now stateless (thanks
> to Tim O'Brien for the timely nudge), meaning in your above use case
> of 12 types of state machines each having 1000 instances, this cost
> is incurred only 12 times, instead of 12000. Thus, we've gone from
> paying a linear price to a (low) constant price. I suspect many of the
> String operations we see are also tied to the SCXML IO bits, and
> therefore have similar constant costs.
>
> * The org.apache.commons.jexl.* packages have to do with expression
> evaluation (in this document, we're using JEXL [4] expressions).
> Expression language parsing is also expensive, but there is not much
> Commons SCXML can do about it.
>
> * The {org.apache.commons.logging.*,java.util.logging.*} packages are
> logging overheads. This particular test class uses extensive logging,
> including a simple (purely logs callbacks) SCXMLListener,
> EventDispatcher and Tracer. Adding these logging bits is an
> application-dependent choice, though I don't think the logging
> overheads are significant enough to lose their value-add in any case.
>
> Those are pretty much the relevant entries from an [SCXML]
> perspective, IMO. Therefore, I haven't felt the urge to dig any
> deeper.
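To make the parse-once point in the quoted text concrete, here is a minimal sketch of the idea in plain Java. It deliberately does not use the Commons SCXML API: Model, Instance, parse and modelFor are hypothetical names I made up for illustration, parse is only a stand-in for the expensive XML parsing step, and the code assumes a Java 8 or later JDK. The only thing it shows is that sharing one stateless model per type means the parse cost grows with the number of types, not the number of instances.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SharedModelSketch {

    /** Hypothetical stand-in for a parsed, stateless state machine definition. */
    static final class Model {
        final String type;
        Model(String type) { this.type = type; }
    }

    /** Hypothetical per-instance runtime state; it only references the shared model. */
    static final class Instance {
        final Model model;
        String currentState = "initial";
        Instance(Model model) { this.model = model; }
    }

    /** Counts how many times the (simulated) expensive parse ran. */
    static int parseCount = 0;

    /** One cached model per state machine type. */
    static final Map<String, Model> cache = new HashMap<>();

    /** Stand-in for the expensive XML parse of one state machine document. */
    static Model parse(String type) {
        parseCount++;
        return new Model(type);
    }

    /** Returns the shared model for a type, parsing only on first request. */
    static Model modelFor(String type) {
        return cache.computeIfAbsent(type, SharedModelSketch::parse);
    }

    /** Creates perType instances for each of the given number of types. */
    static int simulate(int types, int perType) {
        parseCount = 0;
        cache.clear();
        List<Instance> instances = new ArrayList<>();
        for (int t = 0; t < types; t++) {
            for (int i = 0; i < perType; i++) {
                instances.add(new Instance(modelFor("type-" + t)));
            }
        }
        return instances.size();
    }

    public static void main(String[] args) {
        int created = simulate(12, 1000);
        // prints "12000 instances, 12 parses"
        System.out.println(created + " instances, " + parseCount + " parses");
    }
}
```

This only works because the model carries no per-instance state; anything that changes while a machine runs (currentState here) has to live in the instance.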
>
> -Rahul
>
> (long, possibly fragmented URLs below)
>
> [1] http://people.apache.org/~rahul/commons/scxml/cpu-times.txt
> [2] http://svn.apache.org/viewcvs.cgi/jakarta/commons/sandbox/scxml/trunk/src/main/java/org/apache/commons/scxml/test/StandaloneJexlExpressions.java?view=markup
> [3] http://svn.apache.org/repos/asf/jakarta/commons/sandbox/scxml/trunk/src/test/java/org/apache/commons/scxml/env/jexl/microwave-01.xml
> [4] http://jakarta.apache.org/commons/jexl/
>
> > --Dave
