Hi,
> First of all, none of the implementations are transaction aware.
Correct. There is no concept of transactions in a "normal" rule engine
world, and the problem is not trivial. While rolling back some
"external" actions is easy to implement, proceeding with the rule
inference and remembering all the involved steps requires a local
cached copy of all the changes. One changed bean attribute triggers
some rules, which change other beans' attributes, which in turn
trigger other rules, and so on: simply put, you have a chain reaction,
with changes to the rete internal state (facts) and with external
actions changing the external data.
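To make the bookkeeping concrete, here is a minimal sketch of such a
local change record. Everything in it is an assumption for illustration:
the UndoLog class is hypothetical, and a bean is modelled as a plain Map
rather than a real entity bean. It only covers attribute changes, not
external actions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Hypothetical undo log: remembers the previous value of every attribute
// changed while rules fire, so the whole chain can be undone at once.
class UndoLog {
    private final Deque<Runnable> undoSteps = new ArrayDeque<>();

    // Change one attribute of a "bean" (modelled as a plain Map here)
    // and record how to restore the previous state.
    void set(Map<String, Object> bean, String attribute, Object newValue) {
        final boolean hadValue = bean.containsKey(attribute);
        final Object oldValue = bean.get(attribute);
        undoSteps.push(() -> {
            if (hadValue) {
                bean.put(attribute, oldValue);
            } else {
                bean.remove(attribute);
            }
        });
        bean.put(attribute, newValue);
    }

    // Undo all recorded changes, most recent first.
    void rollback() {
        while (!undoSteps.isEmpty()) {
            undoSteps.pop().run();
        }
    }

    // Forget the recorded changes; they become permanent.
    void commit() {
        undoSteps.clear();
    }
}
```

Every rule action would go through set(), so a chain of firings ends up
as a stack of undo steps that rollback() unwinds in reverse order.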
> The RETE algorithm appears to involve caching all data relevant to rule
> invocation, and responding only to changed information.
Correct. Once run, the inference engine evaluates all the rules,
performs all necessary matching and action execution, and from then on
triggers matching only on fact/rule changes.
> An EJB application presumably has more data than can be cached in memory,
> transactions or no.
Yes and no. It depends on how you perform pattern matching. Normally,
RETE implementations create a (shadow) copy of all the facts as a rete
network, and operate on that representation. This is fast, but memory
usage is really high. One could (possibly) fetch data only when needed,
a kind of lazy evaluation, which would be much slower but would not
require a shadow copy of all the data. "Data" is probably more or less
equivalent to "facts" in rule engine terminology.
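A rough sketch of the lazy variant, under stated assumptions: LazyFacts
is a hypothetical class, and the loader function stands in for whatever
would really fetch a fact (a finder call on a bean home, for example).
Facts are loaded on first use and can be evicted again, instead of
being copied into the network up front.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical lazy fact source: a fact is fetched only when the matcher
// asks for it, instead of shadow-copying every fact in advance.
class LazyFacts<K, V> {
    private final Function<K, V> loader;          // stands in for the real fetch
    private final Map<K, V> loaded = new HashMap<>();
    int loadCount = 0;                            // how often the loader was called

    LazyFacts(Function<K, V> loader) {
        this.loader = loader;
    }

    // Return the fact for this key, fetching it on first access only.
    // (A loader that returns null is simply asked again next time.)
    V get(K key) {
        V fact = loaded.get(key);
        if (fact == null) {
            fact = loader.apply(key);
            loadCount++;
            if (fact != null) {
                loaded.put(key, fact);
            }
        }
        return fact;
    }

    // Drop a fact from memory; it is fetched again if needed later.
    void evict(K key) {
        loaded.remove(key);
    }
}
```

The trade-off is the one described above: each get() on an unloaded
fact costs a fetch, but memory holds only the facts actually touched.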
> [...]
> How is the rule engine to be activated? The idea seems to be that the rule
> engine responds to changes in the fact set. The most obvious such changes
> to me are changes to entity beans. I think perhaps that we could also
> regard someone calling a session bean method as a change in the fact set.
> [...]
I think this direction is the most intuitive, and I am for trying it
out. One question we still need to answer is whether session beans are
the only persistent data (facts) the engine operates on. If yes, and if
rule engine rules cannot create any intermediate, internal state, that
would simplify transaction management a lot. It would mean that we could
have an engine instance per transaction, with rollback/commit fully
based on J2EE transaction support. I need to think about that a little
more.
> [...]
> This model for rule engine connectivity does not involve the code in
> session or entity EJBs knowing anything about the rule engine. The
> connection is entirely declarative. If most of the application business
> flow/rule logic is implemented in the rule engine, this could possibly
> result in session beans all of whose method bodies were empty -- all action
> taken by the rule engine as a result of the method call, dispatched by the
> interceptors.
Exactly. This is probably what we would try to achieve: an
implementation completely transparent to the beans, and a declarative
architecture. However, I suspect this will require building a 'dynamic
rete' matching module.
> Setting up the connection between the rule engine and the EJBs would
> involve specifying which home/ remote methods or CMP set accessor methods
> would notify the rule engine, and what the content (type identity and
> attributes) sent to the rule engine is to be. Also, assert actions need to
> be mapped to creates, retract to remove, and modify to some appropriate EJB
> method.
I need to think more about this. It may be better to 'declare' which
actual beans should be used, instead of just declaring the types (the
rule engine has to fetch the data from somewhere when needed, etc.).
>[...]
>
> Another aspect of this whole area is forward/ backward chaining. My
> understanding is that forward chaining is usually appropriate when you have
>[...]
Jess implements backward chaining by a special design pattern on the rules
via forward chaining. As said, forward chaining is more suitable in some
cases, and backward chaining in others. Forward chaining requires all the
intermediate facts to be stored in the fact set; for some types of
problems this is unacceptable, as it leads to an explosion of facts.
Backward chaining disposes of all intermediate states as soon as
unification is completed. So, if you will need your intermediate states
later, do not care about memory, and want high performance, forward
chaining will be best; if you want small memory usage and do many
operations on very different domains, backward chaining is the way to go.
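The difference in what gets stored can be shown with a toy sketch. It
is only an assumption-laden illustration: rules are plain propositional
implications (no variables, so no real unification), and the
Implication and Chaining names are made up. It is not how Jess or any
other engine does it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical propositional rule: if all premises hold, the conclusion holds.
class Implication {
    final String conclusion;
    final List<String> premises;

    Implication(String conclusion, String... premises) {
        this.conclusion = conclusion;
        this.premises = Arrays.asList(premises);
    }
}

class Chaining {
    // Forward chaining: fire rules until nothing new is derived.
    // Every intermediate fact ends up stored in the returned fact set.
    static Set<String> forward(Set<String> initialFacts, List<Implication> rules) {
        Set<String> facts = new HashSet<>(initialFacts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Implication r : rules) {
                if (facts.containsAll(r.premises) && facts.add(r.conclusion)) {
                    changed = true;
                }
            }
        }
        return facts;
    }

    // Backward chaining: prove the goal by recursively proving premises.
    // Intermediate results live only on the call stack and are discarded.
    static boolean backward(String goal, Set<String> facts,
                            List<Implication> rules, Set<String> inProgress) {
        if (facts.contains(goal)) {
            return true;
        }
        if (!inProgress.add(goal)) {
            return false; // already trying to prove this goal: avoid looping
        }
        try {
            for (Implication r : rules) {
                if (!r.conclusion.equals(goal)) {
                    continue;
                }
                boolean allProved = true;
                for (String premise : r.premises) {
                    if (!backward(premise, facts, rules, inProgress)) {
                        allProved = false;
                        break;
                    }
                }
                if (allProved) {
                    return true;
                }
            }
            return false;
        } finally {
            inProgress.remove(goal);
        }
    }
}
```

forward() returns a fact set that has grown by every derived fact,
while backward() leaves the original fact set untouched and answers
only the one question asked.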
I know Jess, but do not know much about JEOPS. However, due to the
special properties of entity beans (they can map to thousands of facts)
I suspect one will need a more dynamic implementation: one which
fetches the facts when needed.
> So, this are my first thoughts. Does this make sense? I could certainly
> use some good references on this subject, especially some realistic
> examples.
Example: let's say we have two entity beans, one representing books
written by some authors, with one of the properties specifying how many
copies of the book have already been sold, and a second representing
authors. Let's say one of the properties of an author is whether the
author is popular or not (a boolean property). Let's say now that we
have a business rule, declared in our rule engine, that
"if the latest book by the given author has sold more than 50k copies,
the author is popular"
We have to think it over:
*) books and authors may exceed the memory limit for a statically
created RETE network; what do we do about it?
*) how should we process "latest"? There are several ways. For example,
one could split the rule above into several rules, some of which would
look for the latest book by a particular author; however, this is
internal state of the rule engine, which I think we should avoid.
Another would be to delegate the job of finding the latest book by a
given author to a business method on the Author bean, for example. That
would be the most efficient way, but then not that declarative
anymore, etc.
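The second option for "latest" could look roughly like the sketch
below. Plain Java classes stand in for the two entity beans, and
several things are my assumptions rather than anything decided here:
the field names, defining "latest" by a publicationYear property, and
the PopularityRule class that plays the role of the rule engine.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the book entity bean (field names are assumptions).
class Book {
    final String title;
    final int publicationYear; // assumed way of deciding which book is "latest"
    final int copiesSold;

    Book(String title, int publicationYear, int copiesSold) {
        this.title = title;
        this.publicationYear = publicationYear;
        this.copiesSold = copiesSold;
    }
}

// Stand-in for the author entity bean.
class Author {
    final String name;
    final List<Book> books = new ArrayList<>();
    boolean popular;

    Author(String name) {
        this.name = name;
    }

    // Business method: finding the latest book is delegated to the bean,
    // so the rule engine keeps no internal state for it.
    Book latestBook() {
        Book latest = null;
        for (Book b : books) {
            if (latest == null || b.publicationYear > latest.publicationYear) {
                latest = b;
            }
        }
        return latest;
    }
}

// Hypothetical stand-in for the rule: "if the latest book by the given
// author has sold more than 50k copies, the author is popular".
class PopularityRule {
    static final int THRESHOLD = 50_000;

    static void apply(Author author) {
        Book latest = author.latestBook();
        if (latest != null && latest.copiesSold > THRESHOLD) {
            author.popular = true;
        }
    }
}
```

The rule itself stays a one-line condition, but part of the logic now
lives in latestBook(), which is exactly the loss of declarativeness
mentioned above.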
> Is this the right place to discuss this? Or should I keep my mouth shut
> till I have some kind of prototype?
I don't know. We can discuss it here, or we may create a separate
discussion group for the rule engine API and implementation bits, and
discuss here only the issues for jBoss integration.
> All comments suggestions, and coworkers gratefully appreciated.
Count me in. I will try to find some time to do some thinking and
experimenting on this.
best regards
Mariusz