Re: FW: A Fresh Look Proposal (HL7)

Mark Sun, 21 Aug 2011 16:10:26 -0700

I want to print this message on a T-shirt and wear it everywhere I go! Fantastic :-)


Well said, Jim!


Mark

On Sun, 21 Aug 2011 15:12:00 -0700, Jim McCusker <james.mccus...@yale.edu> wrote:

I feel I need to cut to the chase with this one: XML schema cannot validate
semantic correctness.

It can validate that XML conforms to a particular schema, but that is
syntactic. The OWL validator is nothing like a schema validator. First, it
produces a closure of all statements that can be inferred from the asserted
information. This means that if a secondary ontology is used to describe
some data, and that ontology integrates with the ontology that you're
attempting to validate against, you will get a valid result. An XML schema
can only work with what's in front of it.
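
To make the closure point concrete, here is a toy sketch in Python. It only
follows subclass links and is not how Pellet or any real OWL reasoner works;
the class names (clinic:Outpatient, core:Patient, core:Person) are invented
for illustration.

```python
def type_closure(asserted_types, subclass_of):
    """Return the asserted types plus every superclass reachable from them."""
    inferred = set(asserted_types)
    frontier = list(asserted_types)
    while frontier:
        cls = frontier.pop()
        for parent in subclass_of.get(cls, ()):
            if parent not in inferred:
                inferred.add(parent)
                frontier.append(parent)
    return inferred


# Hypothetical secondary ontology (clinic:) that links into the target one (core:).
SUBCLASS_OF = {
    "clinic:Outpatient": ["core:Patient"],
    "core:Patient": ["core:Person"],
}

asserted = {"clinic:Outpatient"}
# A check against the asserted data alone fails; a check against the closure passes.
print("core:Person" in asserted)
print("core:Person" in type_closure(asserted, SUBCLASS_OF))
```

The data never mentions core:Person, yet a check made after the closure is
computed still finds it, which is the behaviour a schema validator cannot
reproduce.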

Two, there are many different representations of information that go beyond
XML, and it should be possible to validate that information without anything
other than a mechanical, universal translation. For instance, there are a
few mappings of RDF into JSON, including JSON-LD, which looks the most
promising at the moment. Since RDF/XML and JSON-LD both parse to the same
abstract graph, there is a mechanical transformation between them. When
dealing with semantic validity, you want to check the graph that is parsed
from the document, not the document itself.

The content matters, the format does not. For instance, let me define a new
RDF format called RDF/CSV:

First column is the subject. First row is the predicate. All other cell
values are objects. URIs that are relative are relative to the document, as
in RDF/XML.

I can write a parser for that in 1 hour and publish it. It's genuinely
useful, and all you would have to do to read and write it is to use my
parser or write one yourself. I can then use the parser, paired with Pellet
ICV, and validate the information in the file without any additional work
from anyone.
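
A minimal sketch of such a parser in Python, using only the standard library.
The description above leaves some details open, so these are assumptions: the
top-left cell is ignored, empty cells produce no triple, and every cell is
treated as a URI reference (nothing says how literals would be marked). The
function name and the example URIs are made up for illustration.

```python
import csv
import io
from urllib.parse import urljoin


def parse_rdf_csv(text, base_uri):
    """Turn the RDF/CSV layout into a list of (subject, predicate, object) triples."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    # First row holds the predicates; its first cell sits above the subject column.
    predicates = [cell.strip() for cell in rows[0][1:]]
    triples = []
    for row in rows[1:]:
        if not row or not row[0].strip():
            continue  # no subject in this row, so nothing to assert
        subject = urljoin(base_uri, row[0].strip())
        for predicate, cell in zip(predicates, row[1:]):
            if predicate and cell.strip():
                # Relative references resolve against the document URI.
                triples.append((
                    subject,
                    urljoin(base_uri, predicate),
                    urljoin(base_uri, cell.strip()),
                ))
    return triples


EXAMPLE = (
    ",http://xmlns.com/foaf/0.1/knows,type\n"
    "alice,bob,http://xmlns.com/foaf/0.1/Person\n"
    "bob,,http://xmlns.com/foaf/0.1/Person\n"
)

for triple in parse_rdf_csv(EXAMPLE, "http://example.org/data/people.csv"):
    print(triple)
```

The resulting triples are the graph; any validator that works on graphs could
take it from there, regardless of the file format it came from.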

Maybe we need a simplified XML representation for RDF that looks more like
regular XML. But to make a schema for an OWL ontology is too much work for
too little payoff.

Jim

On Sun, Aug 21, 2011 at 5:45 PM, Hau, Dave (NIH/NCI) [E] <ha...@mail.nih.gov> wrote:

Hi all,

As some of you may have read, HL7 is rethinking their v3 and doing some
brainstorming on what would be a good replacement for a data exchange
paradigm grounded in robust semantic modeling.

On the following email exchange, I was wondering, if OWL is used for
semantic modeling, are there good ways to accomplish the following:

1.  Generate a wire format schema (for a subset of the model, the subset
they call a "resource"), e.g. XSD

2.  Validate XML instances for conformance to the semantic model.  (Here
I'm reminded of Clark and Parsia's work on their Integrity Constraint
Validator:  http://clarkparsia.com/pellet/icv )

3.  Map an XML instance conformant to an earlier version of the "resource"
to the current version of the "resource" via the OWL semantic model

I think it'd be great to get a semantic web perspective on this fresh look
effort.

Cheers,

Dave

Dave Hau
National Cancer Institute
Tel: 301-443-2545
dave....@nih.gov

*From:* owner-...@lists.hl7.org [mailto:owner-...@lists.hl7.org] *On
Behalf Of *Lloyd McKenzie
*Sent:* Sunday, August 21, 2011 12:07 PM
*To:* Andrew McIntyre
*Cc:* Grahame Grieve; Eliot Muir; Zel, M van der; HL7-MnM; RIMBAA; HL7ITS
*Subject:* Re: A Fresh Look Proposal

Hi Andrew,

Tacking stuff on the end simply doesn't work if you're planning to use XML
Schema for validation. (Putting new stuff in the middle or the beginning
has the same effect - it's an unrecognized element.) The only alternative
is to say that all changes after "version 1" of the specification will be
done using the extension mechanism. That will create tremendous analysis
paralysis as we try to get things "right" for that first version, and will
result in increasing clunkiness in future versions.  Furthermore, the
extension mechanism only works for the wire format.  For the RIM-based
description, we still need proper modeling, and that can't work with "stick
it on the end" no matter what.

That said, I'm not advocating for the nightmare we currently have with v3
right now.

I think the problem has three parts - how to manage changes to the wire
format, how to version resource definitions and how to manage changes to the
semantic model.

Wire format:

If we're using schema for validation, we really can't change anything
without breaking validation. Even making an existing non-repeating element
repeat is going to cause schema validation issues. That leaves us with two
options (if we discount the previously discussed option of "get it right the
first time and be locked there forever"):

1. Don't use schema

- Using Schematron or something else could easily allow validation of the
elements that are present, but ignore all "unexpected" elements
- This would cause significant pain for implementers who like to use schema
to help generate code though

2. Add some sort of a version indicator on new content that allows a
pre-processor to remove all "new" content if processing using an "old"
handler

- Unpleasant in that it involves a pre-processing step and adds extra
"bulk" to the instances, but other than that, quite workable

I think we're going to have to go with option #2. It's not ideal, but is
still relatively painless for implementers. The biggest thing is that we
can insist on "no breaking x-path changes".  We don't move stuff between
levels in a resource wire format definition or rename elements in a resource
wire format definition.  In the unlikely event we have to, we deprecate the
entire resource and create a new version.
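
A rough sketch of what the option #2 pre-processor could look like, in Python
with the standard library. The marker attribute name ("sinceVersion"), the
integer version numbers and the element names are all hypothetical; nothing in
the proposal fixes them. Unmarked content is assumed to date from version 1.

```python
import xml.etree.ElementTree as ET


def strip_newer_content(xml_text, handler_version, marker="sinceVersion"):
    """Drop every element introduced after the version this handler understands."""
    root = ET.fromstring(xml_text)

    def prune(parent):
        # Iterate over a copy because children are removed while walking.
        for child in list(parent):
            if int(child.get(marker, "1")) > handler_version:
                parent.remove(child)
            else:
                prune(child)

    prune(root)
    return ET.tostring(root, encoding="unicode")


INSTANCE = (
    "<Person>"
    "<name>Ann</name>"
    '<pronouns sinceVersion="2">she</pronouns>'
    '<contact><phone>1</phone><email sinceVersion="3">a@b</email></contact>'
    "</Person>"
)

# A version 1 handler sees only the content it was built for.
print(strip_newer_content(INSTANCE, 1))
```

Because elements are only ever removed, never moved or renamed, the paths to
the surviving elements stay the same, which is what makes the "no breaking
x-path changes" rule enforceable.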

Resource versioning:

At some point, HL7 is going to find at least one resource where we blew it
with the original design and the only way to create a coherent wire format
is to break compatibility with the old one.  This will then require
definition of a new resource, with a new name that occupies the same
semantic space as the original.  I.e. we'll end up introducing "overlap".
Because overlap will happen, we need to figure out how we're going to deal
with it.  I actually think we may want to introduce overlap in some places
from the beginning.  Otherwise we're going to force a wire format that can
handle prescriptions for fully-encoded chemotherapy protocols on
implementers of simple community EMRs.  (They can ignore some of the data
elements, but they'd still have to support the full complexity of the nested
data structures.)

I don't have a clear answer here, but I think we need to have a serious
discussion about how we'll handle overlap in those cases where it's
necessary, because at some point it'll be necessary. If we don't figure out
the approach before we start, we can't allow for it in the design.

All that said, I agree with the approach of avoiding overlap as much as
humanly possible.  For that reason, I don't advocate calling the Person
resource "Person_v1" or something that telegraphs we're going to have new
versions of each resource eventually (let alone frequent changes).
Introduction of a new version of a resource should only be done when the
pain of doing so is outweighed by the pain of trying to fit new content in
an old version, or requiring implementers of the simple to support the
structural complexity of our most complex use-cases.

Semantic model versioning:

This is the space where "getting it right" the first time is the most
challenging. (I think we've done that with fewer than half of the normative
specifications we've published so far.) V3 modeling is hard. The positive
thing about the RFH approach is that very few people need to care. We could
totally refactor every single resource's RIM-based model (or even remove
them entirely), and the bulk of implementers would go on merrily exchanging
wire syntax instances.  However, that doesn't mean the RIM-based
representations aren't important. They're the foundation for the meaning of
what's being shared.  And if you want to start sharing at a deeper level
such as RIMBAA-based designs, they're critical. This is the level where OWL
would come in. If you have one RIM-based model structure, and then need to
refactor and move to a different RIM-based model structure, you're going to
want to map the semantics between the two structures so that anyone who was
using the old structure can manage instances that come in with the new
structure (or vice versa). OWL can do that. And anyone who's got a complex
enough implementation to parse the wire format and trace the elements
through their underlying RIM semantic model will likely be capable of
managing the OWL mapping component as well.

In short, I think we're in agreement that separation of wire syntax and
semantic model is needed. That will make model refactoring much easier.
However we do have to address how we're going to handle wire-side and
resource refactoring too.

Lloyd

--------------------------------------
Lloyd McKenzie

+1-780-993-9501



Note: Unless explicitly stated otherwise, the opinions and positions
expressed in this e-mail do not necessarily reflect those of my clients nor
those of the organizations with whom I hold governance positions.

On Sun, Aug 21, 2011 at 7:53 AM, Andrew McIntyre <and...@medical-objects.com.au> wrote:

Hello Lloyd,

While "tacking stuff on the end" in V2 may not at first glance seem like an
elegant solution I wonder if it isn't actually the best solution, and one
that has stood the test of time. The parsing rules in V2 do make version
updates quite robust wrt backward and forward inter-operability.
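
To illustrate the kind of parsing rule meant here, a small Python sketch:
a receiver reads the fields it knows, ignores anything tacked on the end, and
treats missing trailing fields as empty. The "ZPT" segment and its field names
are invented for the example, and real V2 parsing has more to it (MSH, for
instance, is a special case because it declares the separators).

```python
def parse_segment(segment, known_fields):
    """Read a pipe-delimited segment using only the fields this receiver knows."""
    segment_id, *values = segment.split("|")
    # Extra trailing fields are ignored; absent trailing fields read as empty.
    parsed = {
        name: (values[i] if i < len(values) else "")
        for i, name in enumerate(known_fields)
    }
    return segment_id, parsed


# Hypothetical field lists for an older and a newer version of the same segment.
OLD_FIELDS = ["id", "name"]
NEW_FIELDS = ["id", "name", "birth_date", "phone"]

# Older receiver, newer message: the unknown birth date is simply skipped.
print(parse_segment("ZPT|12345|Smith^Ann|19700101", OLD_FIELDS))
# Newer receiver, older message: the fields it expected but did not get are empty.
print(parse_segment("ZPT|12345|Smith^Ann", NEW_FIELDS))
```

Both directions work without either side knowing which version the other
implements, as long as new fields only ever go on the end.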

I am sure it could be done with OWL but I doubt we can switch the world to
using OWL in any reasonable time frame and we probably need a less abstract
representation for commonly used things. In V2, OBX segments used in a
hierarchy can create an OWL-like object-attribute structure for information
that is not modeled by the standard itself.

I do think the wire format and any overlying model should be distinct
entities so that the model can be evolved and the wire format be changed in
a backward compatible way, at least for close versions.

I also think that the concept of templates/archetypes to extend the model
should not invalidate the wire format, but be a metadata layer over the wire
format. This is what we have done in Australia with ISO 13606 Archetypes
in V2 projects. I think we do need a mechanism for people to develop
templates to describe hierarchical data and encode that in the wire format
in a way that does not invalidate its vanilla semantics (i.e. non-templated
V2 semantics) when the template mechanism is unknown or not implemented.

In a way the V2 specification does hint at underlying objects/interfaces,
and there is a V2 model, but it is not prescriptive and there is no
requirement for systems to use the same internal model as long as they use
the bare bones V2 model in the same way. Obviously this does not always work
as well as we would like, even in V2, but it does work well enough to use it
for quite complex data when there are good implementation guides.

If we could separate the wire format from the clinical models then the 2
can evolve in their own way. We have done several trial implementations of
Virtual Medical Record Models (vMR) which used V3 datatypes and RIM-like
classes and could build those models from V2 messages, or in some cases
non-standard Web Services, although for specific clinical classes we did
use ISO 13606 archetypes to structure the data in V2 messages.

I think the dream of having direct model serializations as messages is
flawed for all the reasons that have made V3 impossible to implement in the
wider world. While the "tack it on the end, lots of optionality" rationale
might seem clunky, maybe it's the best solution to a difficult problem. If we
define tight SOAP web services for everything we will end up with thousands
of slightly different SOAP calls for every minor variation and I am not sure
this is the path to enlightenment either.

I am looking at Grahame's proposal now, but I do wonder if the start again
from scratch mentality is not part of the problem. Perhaps that is a lesson
to be learned from the V3 process. Maybe the problem is too complex to solve
from scratch and like nature we have to evolve and accept there is lots of
junk DNA, but maintaining a working standard at all times is the only way to
avoid extinction.

I do like the idea of a cohesive model for use in decision support, and
that's what the vMR/GELLO is about, but I doubt there will ever be a one
size fits all model and any model will need to evolve. Disconnecting the
model from the messaging, with all the pain that involves, might create a
layered approach that might allow the HL7 organism to evolve gracefully. I
do think part of the fresh look should be education on what V2 actually
offers, and can offer, and I suspect many people in HL7 have never seriously
looked at it in any depth.

Andrew McIntyre

Saturday, August 20, 2011, 4:37:37 AM, you wrote:

Hi Grahame,

Going to throw some things into the mix from our previous discussions
because I don't see them addressed yet. (Though I admit I haven't reread
the whole thing, so if you've addressed and I haven't seen, just point me at
the proper location.)

One of the challenges that has bogged down much of the v3 work at the
international level (and which causes a great deal of pain at the
project/implementation level) is the issue of refactoring. The pain at the
UV level comes from the fact that we have a real/perceived obligation to
meet all known and conceivable use-cases for a particular domain.  For
example, the pharmacy domain model needs to meet the needs of clinics,
hospitals, veterinarians, and chemotherapy protocols and must support the
needs of the U.S., Soviet Union and Botswana.  To make matters more
interesting, participation from the USSR and Botswana is a tad light.
However the fear is that if all of these needs aren't taken into account,
then when someone with those needs shows up at the door, the model will need
to undergo substantive change, and that will break all of the existing
systems.

The result is a great deal of time spent gathering requirements and
refactoring and re-refactoring the model as part of the design process,
together with a tendency to make most, if not all data elements optional at
the UV level. A corollary is that the UV specs are totally unimplementable
in an interoperable fashion. The evil of optionality that manifested in v2
that v3 was going to banish turned out to not be an issue of the standard,
but rather of the issues with creating a generic specification that
satisfies global needs and a variety of use-cases.

The problem at the implementer/project level is that when you take the UV
model and tightly constrain it to fit your exact requirements, you discover
6 months down the road that one or more of your constraints was wrong and
you need to loosen it, or you have a new requirement that wasn't thought of,
and this too requires refactoring and often results in wire-level
incompatibilities.

One of the things that needs to be addressed if we're really going to
eliminate one of the major issues with v3 is a way to reduce the fear of
refactoring. Specifically, it should be possible to totally refactor the
model and have implementations and designs work seamlessly across versions.
I think putting OWL under the covers should allow for this.  If we can
assert equivalencies between data elements in old and new models, or even
just map the wire syntaxes of old versions to new versions of the definition
models, then this issue would be significantly addressed:
- Committees wouldn't have to worry about satisfying absolutely every
use-case to get something useful out because they know they can make changes
later without breaking everything. (They wouldn't even necessarily have to
meet all the use-cases of the people in the room! :>)
- Realms and other implementers would be able to have an interoperability
path that allowed old wire formats to interoperate with new wire formats
through the aid of appropriate tooling that could leverage the OWL under the
covers.  (I think creating such tooling is *really* important because
version management is a significant issue with v3.  And with XML and
schemas, the whole "ignore everything on the end you don't recognize" from
v2 isn't a terribly reasonable way forward.)

I think it's important to figure out exactly how refactoring and version
management will work in this new approach. The currently proposed approach
of "you can add stuff, but you can't change what's there" only scales so
far.


I think we *will* need to significantly increase the number of Resources
(from 30 odd to a couple of hundred).  V3 supports things like invoices,
clinical study design, outbreak tracking and a whole bunch of other
healthcare-related topics that may not be primary-care centric but are still
healthcare centric. That doesn't mean all (or even most) systems will need
to deal with them, but the systems that care will definitely need them. The
good news is that most of these more esoteric areas have responsible
committees that can manage the definition of these resources, and as you
mention, we can leverage the RMIMs and DMIMs we already have in defining
these structures.


The specification talks about robust capturing of requirements and
traceability to them, but gives no insight into how this will occur. It's
something we've done a lousy job of with v3, but part of the reason for that
is it's not exactly an easy thing to do. The solution needs to flesh out
exactly how this will happen.


We need a mapping that explains exactly what's changed in the datatypes
(and for stuff that's been removed, how to handle that use-case).

There could still be a challenge around granularity of text.  As I
understand it, you can have a text representation for an attribute, or for
any XML element.  However, what happens if you have a text blob in your
interface that covers 3 of 7 attributes inside a given XML element?  You
can't use the text property of the element, because the text only covers 3
of 7. You can't use the text property of one of the attributes because it
covers 3 separate attributes. You could put the same text in each of the 3
attributes, but that's somewhat redundant and is going to result in
rendering issues. One solution might be to allow the text specified at the
element level to identify which of the attributes the text covers.  A
rendering system could then use that text for those attributes, and then
render the discrete values of the remaining specified attributes. What this
would mean is that an attribute might be marked as "text" but not have text
content directly if the parent element had a text blob that covered that
attribute.



New (to Grahame) comments:
I didn't see anything in the HTML section or the transaction section on how
collisions are managed for updates.  A simple requirement (possibly
optional) to include the version id of the resource being updated or deleted
should work.
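
As a sketch of how that requirement could work (an in-memory Python toy, not
anything from the draft; the class and method names and the integer version
ids are assumptions): the server keeps a version id per resource and rejects
any update whose stated version is no longer current.

```python
class VersionConflict(Exception):
    """Raised when an update names a version that is no longer current."""


class ResourceStore:
    def __init__(self):
        self._resources = {}  # resource id -> (version id, content)

    def create(self, resource_id, content):
        self._resources[resource_id] = (1, content)
        return 1

    def read(self, resource_id):
        return self._resources[resource_id]

    def update(self, resource_id, expected_version, content):
        current_version, _ = self._resources[resource_id]
        if expected_version != current_version:
            raise VersionConflict(
                f"{resource_id}: expected version {expected_version}, "
                f"current is {current_version}"
            )
        new_version = current_version + 1
        self._resources[resource_id] = (new_version, content)
        return new_version


store = ResourceStore()
store.create("person/1", {"name": "Ann"})
version_seen_by_a, _ = store.read("person/1")
version_seen_by_b, _ = store.read("person/1")

store.update("person/1", version_seen_by_a, {"name": "Ann Smith"})
try:
    # Client B still holds the old version id, so its update is refused.
    store.update("person/1", version_seen_by_b, {"name": "Anne"})
except VersionConflict as conflict:
    print("rejected:", conflict)
```

The second client has to re-read the resource and reapply its change, so
neither update is silently lost. The same check would apply to a delete.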

To my knowledge, v3 (and possibly v2) has never supported true "deletes".
At best, we do an update and change the status to nullified. Is that the
intention of the "Delete" transaction, or do we really mean a true "Delete"?
Do we have any use-cases for true deletes?

I wasn't totally clear on the context for uniqueness of ids. Is it within
a given resource or within a given base URL?  What is the mechanism for
referencing resources from other base URLs? (We're likely to have networks
of systems that play together.)

Nitpick: I think "id" might better be named "resourceId" to avoid any
possible confusion with "identifier".  I recognize that from a coding
perspective, shorter is better. However, I think that's outweighed by the
importance of avoiding confusion.

In the resource definitions, you repeated definitions for resources
inherited from parent resources.  E.g. Person.created inherited from
Resource.Base.created.  Why?  That's a lot of extra maintenance and
potential for inconsistency.  It also adds unnecessary volume.

Suggest adding a caveat to the draft that the definitions are placeholders
and will need significant work. (Many are tautological and none meet the
Vocab WG's guidelines for quality definitions.)

Why is Person.identifier mandatory?

You've copied "an element from Resource.Base.???" to all of the Person
attributes, including those that don't come from Resource.Base.

Obviously the workflow piece and the conformance rules that go along with
it need some fleshing out. (Looks like this may be as much fun in v4 as it
has been in v3 :>)

The list of identifier types makes me queasy.  It looks like we're
reintroducing the mess that was in v2. Why? Trying to maintain an ontology
of identifier types is a lost cause.  There will be a wide range of
granularity requirements and at fine granularity, there will be 10s of
thousands. The starter list is pretty incoherent. If you're going to have
types at all, the vocabulary should be constrained to a set of codes based
on the context in which the real-world identifier is present. If there's no
vocabulary defined for the property in that context, then you can use text
for a label and that's it.

I didn't see anything on conformance around datatypes.  Are we going to
have datatype flavors? How is conformance stated for datatype properties?

I didn't see templateId or flavorId or any equivalent.  How do instances
(or portions thereof) declare conformance to "additional" constraint
specifications/conformance profiles than the base one for that particular
server?

We need to beef up the RIM mapping portion considerably.  Mapping to a
single RIM class or attribute isn't sufficient.  Most of the time, we're
going to need to map to a full context model that talks about the
classCodes, moodCodes and relationships.  Also, you need to relate
attributes to the context of the RIM location of your parent.

There's no talk about context conduction, which from an implementation
perspective is a good thing. However, I think it's still needed behind the
scenes.  Presumably this would be covered as part of the RIM semantics
layer?

In terms of the "validate" transaction, we do a pseudo-validate in
pharmacy, but a 200 response isn't sufficient.  We can submit a draft
prescription and say "is this ok?".  The response might be as simple as
"yes" (i.e. a 200). However, it could also be a "no" or "maybe" with a list
of possible contraindications, dosage issues, allergy alerts and other
detected issues.  How would such a use-case be met in this paradigm?

At the risk of over-complicating things, it might be useful to think about
data properties as being identifying or not to aid in exposing resources in
a de-identified way. (Not critical, just wanted to plant the seed in your
head about if or how this might be done.)


All questions and comments aside, I'm definitely in favour of fleshing out
this approach and looking seriously at moving to it. To that end, I think
we need a few things:
- A list of the open issues that need to be resolved in the new approach.
(You have "todo"s scattered throughout. A consolidated list of the "big"
things would be useful.)
- An analysis of how we move from existing v3 to the new approach, both in
terms of leveraging existing artifacts and providing a migration path for
existing solutions as well as what tools, etc. we need.
- A plan for how to engage the broader community for review.  (Should
ideally do this earlier rather than later.)

Thanks to you, Rene and others for all the work you've done.


Lloyd

--------------------------------------
Lloyd McKenzie

+1-780-993-9501



Note: Unless explicitly stated otherwise, the opinions and positions
expressed in this e-mail do not necessarily reflect those of my clients nor
those of the organizations with whom I hold governance positions.

On Fri, Aug 19, 2011 at 9:08 AM, Grahame Grieve <grah...@kestral.com.au> wrote:


hi All

Responses to comments

#Michael

> 1. I would expect more functional interface to use these resources.

as you noted in later, this is there, but I definitely needed to make
more of it. That's where I ran out of steam

> 2. One of the things that was mentioned (e.g. at the Orlando
> WGM RIMBAA Fresh Look discussion) is that we want to use
> industry standard tooling, right? Are there enough libraries that
> implement REST?

this doesn't need tooling. There's schemas if you want to bind to them

> 2b. A lot of vendors now implement WebServices. I think we should
> go for something vendors already have or will easily adopt. Is that the
> case with REST?

Speaking as a vendor/programmer/writer of an open source web services
toolkit, I prefer REST. Way prefer REST

> Keep up the good work!

ta

#Mark

> I very much like the direction of this discussion towards web services
> and in particular RESTful web services.

yes, though note that REST is a place to start, not a place to finish.

> At MITRE we have been advocating this approach for some time with our
> hData initiative.

yes. you'll note my to do: how does this relate to hData, which is a
higher level specification than the CRUD stuff here.

#Eliot

> Hats off - I think it's an excellent piece of work and definitely a step
> in right direction.

thanks.

> I didn't know other people in the HL7 world other than me were talking
> about (highrise).  Who are they?

not in HL7. you were one. it came up in some other purely IT places that I
play

>  5) Build it up by hand with a wiki - it is more scalable really since
> you

wikis have their problems, though I'm not against them.

> 1) I think it would be better not to use inheritance to define a patient
> as a sub type of a person.  The trouble with that approach is that people
> can

On the wire, a patient is not a sub type of person. The relationship
between the two is defined in the definitions.

> A simpler approach is associate additional data with a person if and when
> they become a patient.

in one way, this is exactly what RFH does. On the other hand, it creates a
new identity for the notion of patient (for integrity). We can discuss
whether that's good or bad.

> 2) I'd avoid language that speaks down to 'implementers'.  It's
> enterprise

really? Because I'm one. down the bottom of your enterprise pole. And
I'm happy to be one of those stinking implementers down in the mud.
I wrote it first for me. But obviously we wouldn't want to cause offense.
I'm sure I haven't caused any of that this week ;-)

> 3) If you want to reach a broader audience, then simplify the language.

argh, and I thought I had. how can we not use the right terms? But I
agree that the introduction is not yet direct enough - and that's after
4 rewrites to try and make it so....

Grahame


************************************************
To access the Archives of this or other lists or change your list settings
and information, go to: http://www.hl7.org/listservice








--
Best regards,
Andrew
mailto:and...@medical-objects.com.au
sent from a real computer
