Re: Rules in archetypes - what are the requirements?

Pieter Bos Sat, 02 Feb 2019 08:22:15 -0800

From: openEHR-technical <[email protected]> on behalf 
of Thomas Beale <[email protected]>
Reply-To: For openEHR technical discussions 
<[email protected]>
Date: Saturday, 2 February 2019 at 15:01
To: "[email protected]" <[email protected]>
Subject: Re: Rules in archetypes - what are the requirements?



Assuming you meant to put 'id7' as the first one, I don't understand what this 
achieves beyond just:

/events[id2]/data/items/element[id7]/value/magnitude >
        /events[id2]/data/items/element[id4]/value/magnitude +
        /events[id2]/data/items/element[id5]/value/magnitude +
        /events[id2]/data/items/element[id6]/value/magnitude

which says the same thing, for all events in the runtime data that conform to 
the /events[id2] branch of the structure.

Since events[id2] can occur more than once, 
/events[id2]/data/items/element[id7]/value/magnitude in the execution 
environment (actual data) maps to a List<Real>. That means you either need to 
define operators such as +, > and = on a list of reals, or define somehow that 
a statement containing only path bindings within a single multiply-occurring 
structure gets evaluated once for each occurrence of that structure. The second 
case is complicated if you need to include data outside /events[id2] in your 
expression. A real-world use case would be data in /protocol, which can 
influence the interpretation of event data, but is outside of the event.
So we did the first in Archie, with a few tricks to make this work for 
assignments. For example, this is how the plus operator is interpreted in 
Archie:

+(Real r1, Real r2)
  return the sum of both numbers
+(Real r1, List<Real> r2)
  return a list the same size as r2, with every element result[i] = r1 + r2[i]
+(List<Real> r1, Real r2)
  return a list the same size as r1, with every element result[i] = r1[i] + r2
+(List<Real> r1, List<Real> r2)
  if r1 and r2 have different lengths, throw an exception. Otherwise, return a 
  list the same size as r1, with every element result[i] = r1[i] + r2[i]

And the > operator:

>(Real r1, Real r2)
  return true if r1 > r2, false otherwise
>(Real r1, List<Real> r2)
  return a list of Booleans, one for every element in r2, with result[i] true 
  if r1 > r2[i]
>(List<Real> r1, Real r2)
  return a list of Booleans, one for every element in r1, with result[i] true 
  if r1[i] > r2
>(List<Real> r1, List<Real> r2)
  if r1 and r2 have different lengths, throw an exception. Otherwise, return a 
  list of Booleans, with every element result[i] being r1[i] > r2[i]
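
As a rough illustration of the operator rules above, here is a minimal Python 
sketch. It is not Archie's actual implementation: the names lift, plus and 
greater_than are made up for this example, and null handling and 
integer-to-real conversion are deliberately left out.

```python
from typing import Callable, List, Union

Value = Union[float, List[float]]


def lift(op: Callable[[float, float], object], r1: Value, r2: Value):
    """Apply a binary operator to two scalars, or element-wise if lists are involved."""
    left_is_list = isinstance(r1, list)
    right_is_list = isinstance(r2, list)
    if left_is_list and right_is_list:
        # Two lists must line up one-to-one, otherwise the result is undefined.
        if len(r1) != len(r2):
            raise ValueError("operands have different lengths")
        return [op(a, b) for a, b in zip(r1, r2)]
    if left_is_list:
        return [op(a, r2) for a in r1]
    if right_is_list:
        return [op(r1, b) for b in r2]
    return op(r1, r2)


def plus(r1: Value, r2: Value):
    return lift(lambda a, b: a + b, r1, r2)


def greater_than(r1: Value, r2: Value):
    return lift(lambda a, b: a > b, r1, r2)
```

With one magnitude per event in each list, the comparison at the top of this 
mail would then be written as greater_than(a, plus(plus(b, c), d)).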

This is a simplification: in reality there is also null handling and 
integer-to-real conversion involved, which I think is also missing from the 
specification.
We defined assertions/checks to succeed only if every Boolean in such a list is 
true or null/undefined (so as not to fail on data which is optional). 
Additionally, in Archie every value returned contains every unique path in the 
data that was used to evaluate it. That lets you see exactly for which data an 
assertion/check failed, so you can notify the user where the problem occurs, 
and it lets you apply an assignment to the correct output if the output path 
does not match a single field. This last bit is implementation specific of 
course.
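
A minimal Python sketch of that check rule, under stated assumptions: each 
evaluated Boolean is represented here as a (path, value) pair, which is a 
made-up shape for illustration and not how Archie stores its results. The 
function names and the example paths are hypothetical too.

```python
from typing import List, Optional, Tuple

# Hypothetical shape: the data path a result came from, plus its value.
# None stands for null/undefined.
Evaluated = Tuple[str, Optional[bool]]


def check_passes(results: List[Evaluated]) -> bool:
    """A check succeeds if every value is true or undefined."""
    return all(value is None or value is True for _, value in results)


def failing_paths(results: List[Evaluated]) -> List[str]:
    """Paths of the data for which the check evaluated to false."""
    return [path for path, value in results if value is False]
```

For three events where the second has no data and the third violates the rule, 
check_passes returns False and failing_paths names only the third event's path.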

Much of this complexity can be avoided by not defining the operators on 
lists/sets, and instead requiring a for_all loop over lists or sets of data in 
the specification. This comes at a price, because the author of the expressions 
needs to understand more of the language and data structures. So we chose to 
define the operators on lists, since the previous draft specification did not 
specify at all how to handle these cases.

Undefined value handling is another subject; I have not yet checked whether the 
new proposal solves it. We defined some functions to handle this explicitly 
where the automatic handling doesn’t do it ((input, alternative) -> return 
input unless input is undefined, then alternative), as well as some rounding 
functions.
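
The fallback function described above could look roughly like this in Python. 
The name value_or is invented for this sketch, and None is assumed to stand for 
an undefined value.

```python
from typing import Optional


def value_or(input_value: Optional[float], alternative: float) -> float:
    """Return input_value unless it is undefined (None), then return alternative."""
    # Test for None explicitly so that a legitimate 0.0 is not replaced.
    return alternative if input_value is None else input_value
```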

If we were to allow the expression for_all $event in /data/events[id3] then we 
need to be clear on what it means: it actually refers to an object of type 
List<EVENT>, but do the members consist of EVENT objects found in data at the 
moment the expression is evaluated? Or just the statically defined members in 
the archetype - which can also be multiple, e.g. see the Apgar archetype, it 
has 1 min, 2 min, 5 min events?

You would need to evaluate on actual data. If you define it on the archetype 
data, you would need some kind of rules to convert it to an evaluation on 
actual data with different multiplicities than the rules specify, for example 
if events[id2] has occurrences > 1. That might be possible, but I have not 
tried to define it. It would probably include some extra for_all loops plus 
some kind of validation errors for cases that cannot be converted.
So I would say: always the data found at the moment the expression is 
evaluated. You can still refer to separate statically defined members using 
their distinct node ids, and even those could have occurrences > 1 (not in the 
Apgar example, since those have occurrences {0..1} in the archetype).

Normally we want the processing of 'rules' expressions in archetypes to apply 
to the data when the archetype is being used in its normal role of creation / 
semantic validation at data commit time.

Agreed.

So it seems to me that if we want to support expressions like the above, we 
need to be able to do something like (don't worry about the concrete syntax too 
much, could easily be TS or java-flavoured):

use_model
    org.openehr.rm

data_context

    $xxx_events: List<EVENT>
    $item_aaa, $item_bbb, $item_ccc, $item_ddd: Real

definition

    check for_all event in $xxx_events:
        event/$item_aaa > event/$item_bbb + event/$item_ccc + event/$item_ddd

data_bindings -- pseudo-code for now

    $xxx_events -> /events[id2]
    $item_aaa -> /data/items/element[id7]/value/magnitude
    $item_bbb -> /data/items/element[id4]/value/magnitude
    $item_ccc -> /data/items/element[id5]/value/magnitude
    $item_ddd -> /data/items/element[id6]/value/magnitude

I don't know what this archetype is, so assume that $xxx_events, $item_aaa etc 
are more meaningful names.

That would leave $item_aaa globally defined as a path reference, not matching a 
path in the archetype on its own. Would the type of the variable be some kind 
of PATH_REFERENCE? That would solve the problem. Why not just inline the 
assignments, such as:

$events := /events[id2]

for_all event in $events:
    $item_aaa := /data/items/element[id7]/value/magnitude
    $item_bbb := /data/items/element[id4]/value/magnitude
    $item_ccc := /data/items/element[id5]/value/magnitude
    $item_ddd := /data/items/element[id6]/value/magnitude

    check $item_aaa > $item_bbb + $item_ccc + $item_ddd
end

Or if the end keyword is not desired, perhaps

$events := /events[id2]

for_all event in $events:
    ($item_aaa := $events/data/items/element[id7]/value/magnitude,
    $item_bbb := $events/data/items/element[id4]/value/magnitude,
    $item_ccc := $events/data/items/element[id5]/value/magnitude,
    $item_ddd := $events/data/items/element[id6]/value/magnitude)

    check $item_aaa > $item_bbb + $item_ccc + $item_ddd

The second syntax is a bit less flexible, though: when multiple statements need 
to be included within the for loop, you would need to repeat the for statement 
for each of them.
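
To make the intended semantics concrete, here is a small Python sketch of a 
for_all check evaluated over the events actually present in the data. The data 
shape (one dict per event, keyed by element node id) and the function name are 
assumptions made for this example only.

```python
from typing import Dict, List

# Hypothetical runtime data: each event maps an element node id to its magnitude.
Event = Dict[str, float]


def check_for_all(events: List[Event]) -> List[bool]:
    """Evaluate id7 > id4 + id5 + id6 once per event found in the data."""
    results = []
    for event in events:
        # The bindings are resolved relative to the current event.
        item_aaa = event["id7"]
        item_bbb = event["id4"]
        item_ccc = event["id5"]
        item_ddd = event["id6"]
        results.append(item_aaa > item_bbb + item_ccc + item_ddd)
    return results
```

The number of results follows the number of events in the data, not the number 
of event nodes defined in the archetype.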

The next problem you mentioned is:

PB: Note that a path that points to a single typed DV_QUANTITY in an archetype 
can still point to many items in the RM if somewhere up the tree there is a 
list or a set, for example more than one observation.

So I think this implies an incorrect interpretation of this kind of code within 
an archetype. It can't be understood as simultaneously applying to multiple 
Observations if it is within an Observation archetype, only to one OBSERVATION 
instance at a time - usually one about to be committed.

You can still have Lists of things internal to the archetype, as shown above 
with the Events list, but to process the multiplicity, you would need to do as 
we have done and use for_all, or some other container-aware operator or 
function.

Of course, the context is a single instance of data. In the case of an 
OBSERVATION archetype that is a single OBSERVATION, possibly with more than one 
event, or with one event containing a cluster that occurs twice. But in the 
case of a COMPOSITION archetype it can contain more than one OBSERVATION.

Anyway, does this get closer to the sense of what you would like to do? It's 
more than I had conceived of, so this is a useful challenge...

Much closer! It does not do everything (it looks like it is not Turing 
complete?), but functions will solve the remaining cases.
