I had to start a separate thread on data synthesis to keep it apart 
from the Oracle XML DB problems.

If you have access to primary data, e.g. the usual (Pat_ID, Timestamp, 
Code, Value) records, you can build a very simple generative model by 
training a neural network to produce sequences with similar statistics. 
In that case the timing, the covariance between codes and the dynamics 
of code values (for codes that carry a value) are all lumped together 
in one model.
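
To make the data flow concrete, here is a minimal sketch of "fit on real 
sequences, then sample similar ones". Note that it swaps the neural 
network for a much cruder first-order Markov chain so that it stays 
short and dependency-free; the toy records, the code names and the 
functions fit() and generate() are all hypothetical, not the pilot's 
actual setup.

```python
import random
from collections import defaultdict

# Hypothetical toy records: (pat_id, timestamp_in_days, code, value_or_None)
RECORDS = [
    ("p1", 0, "DEMENTIA", None),
    ("p1", 30, "BP_SYS", 150.0),
    ("p1", 45, "FLU", None),
    ("p2", 0, "BP_SYS", 140.0),
    ("p2", 10, "DEMENTIA", None),
    ("p2", 40, "FLU", None),
]

START, END = "<start>", "<end>"


def fit(records):
    """For each preceding code, collect the observed (next_code, gap, value) steps."""
    by_patient = defaultdict(list)
    for pat_id, ts, code, value in records:
        by_patient[pat_id].append((ts, code, value))

    transitions = defaultdict(list)
    for events in by_patient.values():
        events.sort(key=lambda e: e[0])  # order each patient's events by time
        prev_code, prev_ts = START, events[0][0]
        for ts, code, value in events:
            transitions[prev_code].append((code, ts - prev_ts, value))
            prev_code, prev_ts = code, ts
        transitions[prev_code].append((END, 0, None))  # sequence ends here
    return transitions


def generate(transitions, pat_id, rng, max_events=50):
    """Sample one synthetic patient by walking the empirical transitions."""
    out, code, ts = [], START, 0
    for _ in range(max_events):
        next_code, gap, value = rng.choice(transitions[code])
        if next_code == END:
            break
        ts += gap
        out.append((pat_id, ts, next_code, value))
        code = next_code
    return out


model = fit(RECORDS)
synthetic = generate(model, "synthetic-1", random.Random(0))
```

As in the neural network case, timing, code co-occurrence and values are 
lumped together here: each sampled step carries its code, its time gap 
and its value as one unit.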

You can set inclusion/exclusion criteria that specify a sub-population 
and then train the network on, and generate data for, that specific 
population.
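
A minimal sketch of such a selection step, assuming the criteria are 
expressed as sets of codes (a patient is kept if they have at least one 
inclusion code and no exclusion code). The toy records, the code names 
and the helpers select_patients() and subset() are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records: (pat_id, timestamp_in_days, code, value_or_None)
records = [
    ("p1", 0, "DEMENTIA", None),
    ("p1", 30, "FLU", None),
    ("p2", 0, "BP_SYS", 140.0),
    ("p3", 5, "DEMENTIA", None),
    ("p3", 9, "CANCER", None),
]


def select_patients(records, include_any, exclude_any):
    """Return ids of patients with any inclusion code and no exclusion code."""
    codes = defaultdict(set)
    for pat_id, _ts, code, _value in records:
        codes[pat_id].add(code)
    return {
        pat_id
        for pat_id, seen in codes.items()
        if seen & include_any and not seen & exclude_any
    }


def subset(records, patient_ids):
    """Keep only the records that belong to the selected patients."""
    return [r for r in records if r[0] in patient_ids]


cohort = select_patients(records, {"DEMENTIA"}, {"CANCER"})
training_records = subset(records, cohort)
```

Only training_records would then be passed to whatever model is being 
trained, so the generated data reflects that sub-population alone.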

As a practical example, in a pilot study of a few thousand elderly 
patients, we found that dementia inevitably came with significant 
cardiovascular problems and also a lot of... flu.

Training a network to reproduce similar sets of codes and values is a 
crude approach, but the dream of specifying patient data as a linear 
mixture of the profiles of different conditions is still some way off :)

Of course, once this is done, we are still left with mapping a given 
clinical encoding scheme to the suitable openEHR archetype (if we are 
looking for a general solution, that is).

All the best
Athanasios

On 16/04/2015 10:22, Bert Verhees wrote:
> On 16-04-15 11:13, Thomas Beale wrote:
>>
>> Indeed, it would be a great thing. The reason it doesn't exist so far,
>> is that to be useful we need synthesised data sets that have some
>> realistic statistical spread of values. Since we are talking at
>> multiple levels - not just vital signs measurements, but covariance of
>> all kinds of measurements with assessments (diagnosis etc), plans and
>> orders and actions, the complexity is not trivial.
>>
>> A data synthesiser to do this for openEHR would be a fantastic
>> Master's project (hint :).
>
> I use Oxygen, it can generate XML instances to XML Schema's, but first
> we need to change the data-element of version to have type Locatable, or
> Composition.
> If wanted, I can generate them too, it is only one minute work.
>
> Bert
>>
>> - thomas
>>
>> On 16/04/2015 10:02, Dmitry Baranov wrote:
>>> Diego,
>>> that'll be great.
>>> Hope that OpenEHR github owners will provide us with an instance
>>> samples repository some day or other :)
>>>
>>>> I can generate random sample instances from current archetypes for you
>>>> if you need them. Generated data may not make much sense as it only
>>>> tries to follow the archetype constraints, but it should be enough for
>>>> application testing and benchmark
>>
>>
>>
>> _______________________________________________
>> openEHR-technical mailing list
>> openEHR-technical at lists.openehr.org
>> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org
>>
>
