Trying to understand the openEHR Information Model

Thomas Beale Mon, 22 Apr 2013 22:19:06 +0100

On 22/04/2013 21:44, Bert Verhees wrote:
> On 04/22/2013 02:12 PM, Thomas Beale wrote:
>> On 22/04/2013 10:01, Bert Verhees wrote:
>
> But I understand your point, we can discuss that without bashing XML:
> You are saying that people may want to use storage other than 
> XML databases, and then they can't use XQuery.
> You are right, but can they use AQL?
>
> There is only an incomplete definition of AQL in a Wiki, which has had 
> no substantial changes for a long time, thus hardly any progress.
> There is no guarantee that the Wiki is stable.


well it has been complete enough to be implemented and used in 
production systems for some years now. You are right, there are some 
unfinished bits, but they are not key elements - they don't prevent 
large scale systems using AQL.

>
> I think you know how much effort and risk is involved in writing a new 
> query engine for a new language concept on any database concept of choice.
>
> Seref said it to Randolph a few days ago: hardly any work has been 
> done by third parties, there are only two implementations of AQL, and 
> in the same sentence he calls AQL almost the most important part of 
> the OpenEHR eco-system.
> Quote of Seref in this context:
>
>> In my humble opinion, AQL is the most neglected, yet, probably one of 
>> the most important components of an openEHR implementation. It is not 
>> part of the implementation, but it has been implemented by at least 
>> two vendors that I know of, with a third having something quite 
>> similar to it.
> One could, reading this, start to doubt whether OpenEHR can exist 
> without a query language.
> I think Seref is right. It cannot. And then there is no stable 
> specification?

well, again, the specification actually has been stable for a long time. 
It has not been made official like the other specifications (that should 
happen this year), and it probably should have been earlier, but I guess 
this way we have a lot of industry knowledge about it now, so we know it 
works.

>
> Also consider this.
> How can two companies have implemented AQL if there is no stable 
> definition?
> How much money do they put at stake with uncertain result?
> These are rhetorical questions.

it hasn't been a problem for the implementing companies.

>
> It brings me to the conclusion that for third parties, there is only 
> one way to go, and that is XML, and XQuery, there is no other way to 
> get an OpenEHR system ready at this time and the coming few years.

I don't understand why you would say that; there are many already 
running. This page 
<http://www.openehr.org/who_is_using_openehr/healthcare_providers_and_authorities> 
documents systems in production in clinical environments.

> The query language is one difficult part, the other difficult part is 
> validation. Both can be solved using standard industry-tools, I come 
> back to this at the end of this message.

An AQL implementation is actually a lot easier than you think, assuming 
that the main data are stored in blobs.

> And I am not talking about MLHIM. ;-)
>
> The OpenEHR eco-system for XML is ready and full of features.
>
> I don't say XML is the only way to write a kernel. But it has many 
> advantages, because of the wide industry support, and the thousands of 
> man-years of development in it.
> Choosing any other solution means having to write a query engine for 
> a query language which still is not declared stable, and having to 
> write a validation tool which, as far as I know, only exists for DADL.
>
> Implementing OpenEHR for a software-vendor, not using XML, is hardly 
> an option.

that's not at all the case. It's perfectly normal to implement the whole 
system in Java, C#, Python, Ruby, whatever, and use numerous kinds of 
storage: native storage, object storage, MUMPS, relational+blob 
storage, or XML as well. But there is nothing that I can think of in XML 
technology that makes it more attractive than anything else as a basis 
for implementing a core system (it's more or less unavoidable for 
interfacing). XML is one option, there are many others, and they work well.

>
>>
>> The general need we have in openEHR is for an abstract query language 
>> that can be used to express queries to any openEHR (or 13606 or other 
>> archetype-based system), regardless of whether its concrete 
>> persistence happens to be in XML.
>> If you are suggesting that we use XQuery/XPath even for non-XML data 
>> representation cases, that's a different conversation. It won't work 
>> out of the box, because we use a more efficient path syntax (but 
>> which is easily convertible), and XQuery/XPath make other assumptions 
>> due to being targeted to XML, e.g. they assume the XML 
>> attribute/element dichotomy, which doesn't exist in normal object 
>> data; they don't assume an object inheritance model, and so on.
>
> By chance, tomorrow I go to Intersystems, for a technical introduction 
> to Cache and its tooling.
> I am especially interested in the (proprietary) path-based query 
> formalism they support.
>
> I will ask them about XQuery support. I've read on their website that 
> it is possible.
> It would not be surprising if their proprietary path-based query 
> formalism were very much like XPath.

same as AQL in fact. There is no getting away from paths ;-)

>
> This is because how can a serious database-vendor nowadays live 
> without XQuery-support?
> All big database-vendors support XML-structures, and they also support 
> XQuery.
> Check Microsoft, check Oracle, XML is here to stay, and that has been 
> so for 15 years.
>
> XPath2.0 (which is a subset of XQuery 1.0) is very similar to the 
> path-based AQL, easily convertible, as you call it.

I'm not really disagreeing here, it's just that no one has done a recent 
analysis on whether XQuery or a subset will do what AQL does.
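
To make the "easily convertible" remark concrete, here is a minimal sketch 
of rewriting an archetype-style path into an XPath expression. It assumes 
the XML form carries the node identifier in an archetype_node_id attribute, 
and it ignores namespaces and name-based predicates. The function name 
archetype_path_to_xpath is hypothetical, not part of any openEHR library.

```python
import re

# One path segment: an attribute name plus an optional [node_id] predicate.
_SEGMENT = re.compile(r"^(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?:\[(?P<pred>[^\]]+)\])?$")


def archetype_path_to_xpath(path: str) -> str:
    """Rewrite name[node_id] segments as XPath attribute predicates.

    Assumption: the node id is serialised as an XML attribute called
    archetype_node_id. Anything beyond simple id predicates is rejected.
    """
    parts = []
    for segment in path.strip("/").split("/"):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"unsupported path segment: {segment!r}")
        name, pred = match.group("name"), match.group("pred")
        if pred is None:
            parts.append(name)
        else:
            parts.append(f"{name}[@archetype_node_id='{pred}']")
    return "/" + "/".join(parts)


print(archetype_path_to_xpath("/data[at0001]/events[at0006]/value/magnitude"))
```

This only covers the syntactic conversion. It says nothing about whether 
XQuery semantics match AQL, which is the open question raised above.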

>
>
> It is very difficult to do something like that. It will cost 
> man-months/years to get it fast performing and more or less bug-free.

you are over-estimating the difficulty. The only thing that would make 
it hard is using an orthodox third normal form (3NF) relational table 
design, but that should be avoided anyway, for multiple reasons.

Connecting AQL to a well-designed back-end is really not hard. 
Optimising takes some work, but that's to be expected.
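
As a rough illustration of the blob-based approach, the sketch below stores 
each data item as a serialised blob in a single table and evaluates a path 
against the deserialised object. Everything in it is an assumption made for 
the example: the table layout, the JSON blob format, the node ids, and the 
helper names resolve and query. It is not an AQL parser and not how any 
particular product works, only the general shape of the evaluation.

```python
import json
import sqlite3

BP = "openEHR-EHR-OBSERVATION.blood_pressure.v1"  # illustrative archetype id
SYSTOLIC = "data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude"


def resolve(node, path):
    """Follow an archetype-style path through nested dicts and lists."""
    for segment in path.strip("/").split("/"):
        name, _, pred = segment.partition("[")
        child = node[name]
        if pred:
            node_id = pred.rstrip("]")
            # Treat single-valued and multi-valued attributes uniformly.
            candidates = child if isinstance(child, list) else [child]
            # Raises StopIteration if no child has the requested node id.
            child = next(c for c in candidates if c["archetype_node_id"] == node_id)
        node = child
    return node


def query(conn, archetype_id, path, predicate):
    """Select blobs by archetype id, then filter on the value found at path."""
    rows = conn.execute(
        "SELECT ehr_id, body FROM composition WHERE archetype_id = ?",
        (archetype_id,),
    )
    results = []
    for ehr_id, body in rows:
        value = resolve(json.loads(body), path)
        if predicate(value):
            results.append((ehr_id, value))
    return results


def make_bp(systolic):
    """Build a hypothetical blood pressure blob with one systolic value."""
    item = {"archetype_node_id": "at0004",
            "value": {"magnitude": systolic, "units": "mm[Hg]"}}
    event = {"archetype_node_id": "at0006",
             "data": {"archetype_node_id": "at0003", "items": [item]}}
    return json.dumps({"archetype_node_id": BP,
                       "data": {"archetype_node_id": "at0001", "events": [event]}})


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE composition (ehr_id TEXT, archetype_id TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO composition VALUES (?, ?, ?)",
    [("ehr-1", BP, make_bp(150)), ("ehr-2", BP, make_bp(118))],
)
high = query(conn, BP, SYSTOLIC, lambda v: v >= 140)
print(high)  # [('ehr-1', 150)]
```

The relational index narrows the candidate set by archetype id, and the 
path is only evaluated on the blobs that survive. Optimisation would 
mostly mean indexing more than the archetype id.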

>
> The easy part, simple selects, will take a few months, but then comes 
> optimizing with different kinds of indexes (including user-defined 
> indexes), multi-user access, unions, sub-selects, aggregations, 
> authorization.
> It is not easy at all, and I would definitely not advise a company to 
> go this way.

Apparently some successful companies did not take your advice ;-)

>
>
>> Do you mean just that the Release 1.0.2 XSDs need to be better 
>> designed? We certainly know that, and welcome any proposals on that 
>> (of which there are already many).
>
> No, I mean that it is impossible to represent RM 1.0.2 in W3C XML 
> Schema. It is unusable.
> You cannot validate any XML dataset modeled from a CKM archetype 
> against the XSDs on the OpenEHR website, regardless of the 
> XML Schema version.
> It is simply impossible, illegal. OpenEHR is breaking several 
> XML-Schema rules.

which rules is it breaking? As far as I know, openEHR XML documents 
validate normally against the schemas.

> XML Schema in any version is not ready for multi level modeling.
>
> With some tricks it can be done, I do that now, but that is not very 
> elegant.

well, we already had that debate. It's not what we use it for - we don't 
do any 'modelling' in XSD, it's just an interoperability schema.

>
> But I have not found any reason why it cannot be defined in RelaxNG, 
> which is a widely used Oasis standard.
> But, I must admit, I have not completely finished researching this, but 
> I am more than 80% done.
> It relaxes on the points where XML Schema has its blocking restrictions.
> It looks promising, I will let you know, I think, end next month, when 
> I start working again on this.

I am inclined to agree with this: from my limited research into Relax 
NG, it seems significantly better designed than XSD.

>
> My goal however is not only to represent OpenEHR in a schema-language, 
> but everything that can be defined in ADL 1.4, so including OpenEHR.
> And the translation from ADL to schema needs to be done automatically.

I think you should target ADL 1.5, because ADL 1.4 has one or two errors 
in it, plus quite a few limitations.

>
> Oasis, as you will know is an industry standardization organization, 
> it is Domain Member of OMG, and it is also sponsored by OMG (and the 
> members of OMG).
> There are several RelaxNG schema definitions which have become 
> ISO standards; it has been stable for many years now.
>
>
>>
>>> So defining the RM in a XML-Schema is quite useless, and bringing 
>>> people on a dead end street. There are, however good alternatives, 
>>> even better.
>>
>> Not sure what you are saying here, Bert. XML openEHR data is 
>> regularly used as an exchange format for applications and systems. 
>> Can you explain a bit better what you mean by the above?
>
> I am sorry to say.
>
> Writing a W3C XML Schema that represents an archetype and conforms to 
> the base schemas published on the OpenEHR website is not possible, 
> not even for one single archetype from CKM.
> So the XSDs are useless, meaning, there is no way they are useful.

but they are widely used. So there is something not right in what you 
are saying. Are you referring to the XSDs for archetypes, or for the RM?

>
> The conversion to the exchange format cannot be validated against a 
> constrained schema representing the archetype in which the data are 
> defined. I am pretty sure of this.
> You know what you have for source-data, maybe objects in Cache, or 
> DADL or path/value-combinations.
>
> But you don't know if the target/exchange XML data are still valid.
> You can guess they are, but you cannot prove they will always be valid.
> I think, validation after data-transforming is very important. A guess 
> should not be good enough.

ok - so here, you mean - how can it be proved that the XSD-based XML 
representation of the data is in fact a faithful representation of the 
original object data? As far as I know, the conversion of object to XML 
based on the XSDs is not 100% lossless in all cases. That's the price of 
using XSD, but unfortunately huge parts of industry want it.
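
One way to test faithfulness rather than guess at it is a round-trip 
check: serialise the object, parse the XML back, and compare with the 
original. The sketch below is a generic illustration using a deliberately 
naive serialiser that records no type information. It is not the openEHR 
XSD mapping, and the helper names to_xml, from_xml and round_trips are 
hypothetical.

```python
import xml.etree.ElementTree as ET


def to_xml(name, value):
    """Serialise nested dicts of scalars to elements; scalar types are not recorded."""
    elem = ET.Element(name)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_xml(key, child))
    else:
        elem.text = str(value)
    return elem


def from_xml(elem):
    """Rebuild the object as far as the XML allows: every leaf comes back as a string."""
    if len(elem) == 0:
        return elem.text
    return {child.tag: from_xml(child) for child in elem}


def round_trips(value):
    """True when serialising, parsing and rebuilding yields an equal object."""
    text = ET.tostring(to_xml("root", value), encoding="unicode")
    return from_xml(ET.fromstring(text)) == value


print(round_trips({"value": {"units": "mm[Hg]"}}))  # True
print(round_trips({"value": {"magnitude": 120}}))   # False: 120 returns as "120"
```

The second case fails because the integer comes back as a string. Running 
this kind of check over representative data turns "probably valid" into 
something that can actually be measured.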

If you are saying that we should publish Relax NG schemas and advertise 
those as being 'safe to use', I say: let's do it. I'm all for that.

- thomas
