Thanks, Alex, for diving into this first and stating things so eloquently.
I agree with everything Alex wrote and would add:

>If we move the records from VITAL to OBS_CLIN, we need to merge the 
>valuesets for the provenance fields. If we do that, OBSCLIN_SOURCE would 
>contain OD (Order/EHR), RG (Registry/ancillary system) and HC (Healthcare 
>delivery setting).
>There is a fair amount of overlap between these terms.  We are proposing to 
>deprecate OD and RG and utilize HC instead (we will make the same change to 
>OBSGEN_SOURCE as well). 
>Any concerns with this change?

Registries normally contain chart-abstracted data, which can be useful but also
introduces an additional step where human error can occur. I believe it would
be useful to keep the distinction between potentially interpreted data and raw
data from the EHR.

>Addition of Result_text
This value would likely be a free-text field, which may allow PHI to slip
through. We would not recommend adding a free-text field to a limited data
set.

>Addition of Raw Condition Text
This value is a free-text field at one of our institutions and a value set at
another. We could use the value set, but we would not be able to add a
free-text string to a limited data set.

-----Original Message-----
From: Stoddard, Alexander <[email protected]> 
Sent: Thursday, September 24, 2020 10:02 PM
To: [email protected]
Cc: Taylor, Bradley <[email protected]>; [email protected]; Manuel, Laura S M 
<[email protected]>
Subject: CDM 6.0 review responses from MCW

Hello GPC-DEV,

MCW agreed to review the CDM 6.0 spec during the dev call 2020-09-22. The
replies to DRNOC, using an Excel file template (available at
https://pcornet.imeetcentral.com/drnoc-workgroups/folder/WzIwLDEzMTI2ODA5XQ/),
have been requested by end of day Friday 2020-09-25.

Below is a text version of the responses that I will be sending on behalf of
MCW.


Main questions seeking feedback
-------------------------------------

>As the CDM has grown in size, the image included in the specification (Page 9) 
> conveys less and less information.  
>Any concerns if it is deleted?   
Not a concern, but a highlighted list of changed tables/new columns on a single
page would be useful.

>Suggestions on what we might consider as a replacement?
A machine-readable, diff-able and version-controlled schema definition would be
very useful. Potentially this would allow tool-assisted SQL generation for the
different RDBMSs, or even visualization generation. A candidate for such a
schema definition format would be that used by the SQLAlchemy Python package:
https://docs.sqlalchemy.org/en/13/core/metadata.html
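
As a rough illustration of what a machine-readable, diff-able definition could
enable, here is a minimal standard-library sketch. It is not the SQLAlchemy
API; the Column class, the TYPE_MAP dialect mappings and the toy OBS_CLIN
column lists are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical, simplified schema description (not the SQLAlchemy API).
@dataclass(frozen=True)
class Column:
    name: str
    kind: str        # abstract type: "text", "date" or "number"
    length: int = 0  # only meaningful for "text"

# Illustrative type mappings per dialect (assumed, not taken from the CDM spec).
TYPE_MAP = {
    "oracle": {"text": "VARCHAR2({n})", "date": "DATE", "number": "NUMBER"},
    "postgresql": {"text": "VARCHAR({n})", "date": "DATE", "number": "NUMERIC"},
}

def create_table_sql(table, columns, dialect):
    """Render a CREATE TABLE statement for one dialect."""
    types = TYPE_MAP[dialect]
    cols = ", ".join(
        f"{c.name} {types[c.kind].format(n=c.length)}" for c in columns
    )
    return f"CREATE TABLE {table} ({cols})"

def diff_columns(old, new):
    """Return (added, removed) column names between two schema versions."""
    old_names = {c.name for c in old}
    new_names = {c.name for c in new}
    return sorted(new_names - old_names), sorted(old_names - new_names)

# Toy column subsets for two versions of a table, for illustration only.
OBS_CLIN_OLD = [Column("OBSCLINID", "text", 50), Column("OBSCLIN_DATE", "date")]
OBS_CLIN_NEW = [
    Column("OBSCLINID", "text", 50),
    Column("OBSCLIN_START_DATE", "date"),
    Column("OBSCLIN_SOURCE", "text", 2),
]

if __name__ == "__main__":
    print(create_table_sql("OBS_CLIN", OBS_CLIN_NEW, "postgresql"))
    print(diff_columns(OBS_CLIN_OLD, OBS_CLIN_NEW))
```

With one definition under version control, the per-RDBMS SQL and the list of
changed columns both become generated outputs rather than hand-maintained
documents.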

>Are there any concerns about the strategy to deprecate VITAL and move the
>records to OBS_CLIN?
OBS_CLIN is a much better data model for vitals. However, moving distinct
columns in the VITAL table into a single column that requires different value
sets for different qualitative variables will be easier with a more agile and
open process for value-set definition during the transition. Open appending of
additional values to a version-controlled value-set reference would give
projects much greater flexibility to adopt additional tests and observations
throughout the CDM lifecycle without any loss of specificity, accuracy or
backwards compatibility. This is especially true of _QUAL columns, which will
hold values for many different results/observations, unlike the domain-specific
columns historically defined using the current process (e.g. RACE in the
DEMOGRAPHIC table and SMOKING in the VITAL table).

In general, qualitative value sets should be defined per code used to identify
a given observation row, not for the whole _QUAL column.
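
To make the per-code idea concrete, here is a minimal sketch of value sets
keyed by observation code, with open appending. The codes, the allowed values
and the function names are placeholders invented for illustration; they are
not taken from the CDM spec.

```python
# Hypothetical value sets keyed by the observation code, not by the column.
# The codes and allowed values below are placeholders for illustration only.
VALUESETS_BY_CODE = {
    "EXAMPLE_SMOKING_CODE": {"current", "former", "never"},
    "EXAMPLE_BP_POSITION_CODE": {"sitting", "standing", "supine"},
}

def check_qual(obs_code, qual_value):
    """Return True when qual_value is allowed for this observation code."""
    allowed = VALUESETS_BY_CODE.get(obs_code)
    if allowed is None:
        return False  # unknown code: no value set has been registered yet
    return qual_value in allowed

def append_value(obs_code, new_value):
    """Open appending: extend a value set without touching existing values."""
    VALUESETS_BY_CODE.setdefault(obs_code, set()).add(new_value)
```

A value that is valid for one code ("sitting" for the position code) is
rejected for another, which a single column-wide value set cannot express, and
appending a value never invalidates rows that were already valid.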

>If we move the records from VITAL to OBS_CLIN, we need to merge the 
>valuesets for the provenance fields. If we do that, OBSCLIN_SOURCE would 
>contain OD (Order/EHR), RG (Registry/ancillary system) and HC (Healthcare 
>delivery setting).
>There is a fair amount of overlap between these terms.  We are proposing to 
>deprecate OD and RG and utilize HC instead (we will make the same change to 
>OBSGEN_SOURCE as well). 
>Any concerns with this change? 
EHR vs Registry seems like a valid source distinction. In our experience, the
source fields are most often useful for data tracing in QC operations on
individual records, rather than for research and aggregation of the data. A
richer value set may therefore be of benefit to sites.

>Is the description for Telehealth encounters sufficient, or is more detail 
>needed?
Description is sufficient but the real issue is likely the specificity with 
which these encounters (vs routine telephone or other electronic 
communications) are recorded in the source systems of sites.

>If we remove the VALUESET and VALUESET DESCRIPTOR columns from the 
>FIELDS tab of the parseable file, would that pose a problem? (The 
>VALUESETS tab would remain unchanged)
No problem. The data in these columns is much more easily used as represented
in the VALUESETS tab. A flag or categorical value to indicate that a field uses
a value set from the VALUESETS tab would be useful.

General Comments
---------------------
None

Value Sets
-----------
See comments on the VITALS transition. 

LAB_HISTORY table
---------------------
No particular issues with the schema definition, but MCW remains doubtful of
the utility and accuracy achievable with this table compared with a centrally
held one maintained by DRNOC.

If a lab test is stable enough and well defined enough for population reference 
ranges (but doesn't have individual test normal ranges defined for a particular 
source) then a centrally maintained reference fallback is reasonable.

When an assay does not have generalizable normal ranges, e.g. when run relative 
to a variable arbitrary reference and/or varying from machine to machine, then 
you really need a per record reference for the normal range and this table will 
be insufficiently granular and misleading.
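
The distinction drawn in the two paragraphs above can be sketched as a simple
lookup order: use the range recorded with the individual result when there is
one, fall back to a central reference only for assays that have one, and
otherwise report no range at all. The CENTRAL_REFERENCE table, its code and
its values are made up for illustration.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]

# Hypothetical centrally maintained reference; code and values are made up.
CENTRAL_REFERENCE = {"EXAMPLE_STABLE_ASSAY": (3.5, 5.0)}

def normal_range(lab_code: str, record_range: Optional[Range]) -> Optional[Range]:
    """Prefer the range stored with the record, else a central fallback."""
    if record_range is not None:
        return record_range
    # Returns None for assays that have no generalizable normal range.
    return CENTRAL_REFERENCE.get(lab_code)
```

Returning None for an assay without a generalizable range is deliberate in
this sketch: a missing range is less misleading than a population range that
does not apply to the record.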

The spec reads 'Every record in this table should be unique.' but this is 
trivially true given each row has an arbitrary LABHISTORYID and uniqueness is 
otherwise undefined.
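
For the uniqueness rule to be checkable, it needs a stated key. The sketch
below assumes one purely for illustration (the spec does not define it, and
the column names are assumed rather than verified against it) and reports key
values that occur more than once even though every LABHISTORYID differs.

```python
from collections import Counter

# Assumed natural key for illustration only; the spec does not define one.
NATURAL_KEY = ("LAB_LOINC", "SEX", "AGE_MIN_WK", "AGE_MAX_WK", "PERIOD_START")

def duplicate_keys(rows, key=NATURAL_KEY):
    """Return the key tuples that appear in more than one row."""
    counts = Counter(tuple(row.get(col) for col in key) for row in rows)
    return [k for k, n in counts.items() if n > 1]

# Two rows with different surrogate ids but identical content on the key.
EXAMPLE_ROWS = [
    {"LABHISTORYID": "1", "LAB_LOINC": "X", "SEX": "F",
     "AGE_MIN_WK": 0, "AGE_MAX_WK": 52, "PERIOD_START": "2020-01-01"},
    {"LABHISTORYID": "2", "LAB_LOINC": "X", "SEX": "F",
     "AGE_MIN_WK": 0, "AGE_MAX_WK": 52, "PERIOD_START": "2020-01-01"},
    {"LABHISTORYID": "3", "LAB_LOINC": "X", "SEX": "M",
     "AGE_MIN_WK": 0, "AGE_MAX_WK": 52, "PERIOD_START": "2020-01-01"},
]
```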

New / Modified fields
------------------------
LAB_RESULT_CM  RESULT_TEXT              - Implementation concern: in MCW's
    experience SAS expands varchar columns to their maximum width, which
    will bloat table size if a column is sparsely populated with large
    records. A separate relational table with text results keyed by
    LAB_RESULT_CM_ID would be much more efficient.
ENCOUNTER      ENCOUNTER_TYPE           - No comment
ENCOUNTER      ADMITTING_SOURCE         - No comment
CONDITION      CONDITION_SOURCE         - Guidance on the expected source of
    Chief Complaint would be useful. Should it always be linked to an
    ENCOUNTER?
CONDITION      RAW_CONDITION_TEXT       - No comment
OBS_CLIN       OBSCLIN_START_DATE       - No comment
OBS_CLIN       OBSCLIN_START_TIME       - No comment
OBS_CLIN       OBSCLIN_STOP_DATE        - No comment
OBS_CLIN       OBSCLIN_STOP_TIME        - No comment
OBS_CLIN       OBSCLIN_SOURCE           - May be better to maintain the EHR /
    Registry source distinction
OBS_CLIN       OBSCLIN_ABN_IND          - No comment
OBS_GEN        OBSGEN_START_DATE        - No comment
OBS_GEN        OBSGEN_START_TIME        - No comment
OBS_GEN        OBSGEN_STOP_DATE         - No comment
OBS_GEN        OBSGEN_STOP_TIME         - No comment
OBS_GEN        OBSGEN_SOURCE            - May be better to maintain the EHR /
    Registry source distinction
OBS_GEN        OBSGEN_ABN_IND           - No comment
OBS_GEN        OBSGEN_TABLE_MODIFIED    - No comment
HARVEST        CDM_VERSION              - No comment
HARVEST        TOKEN_ENCRYPTION_KEY     - Would TOKEN_ENCRYPTION_KEY_NAME be a
    better name? Please give an example in the guidance.
HARVEST        OBSCLIN_START_DATE_MGMT  - No comment
HARVEST        OBSCLIN_STOP_DATE_MGMT   - No comment
HARVEST        OBSGEN_START_DATE_MGMT   - No comment
HARVEST        OBSGEN_STOP_DATE_MGMT    - No comment
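
The separate text table suggested for RESULT_TEXT above could look roughly
like the following sketch, shown with Python's built-in sqlite3 module so it
runs anywhere. The LAB_RESULT_TEXT table name, the reduced LAB_RESULT_CM
layout and the sample rows are assumptions for illustration, not part of the
spec.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LAB_RESULT_CM (
    LAB_RESULT_CM_ID TEXT PRIMARY KEY,
    RESULT_NUM REAL
);
-- Hypothetical side table: only results that have text get a row here.
CREATE TABLE LAB_RESULT_TEXT (
    LAB_RESULT_CM_ID TEXT PRIMARY KEY
        REFERENCES LAB_RESULT_CM (LAB_RESULT_CM_ID),
    RESULT_TEXT TEXT NOT NULL
);
""")

# Made-up sample data: one numeric result, one result with narrative text.
conn.executemany(
    "INSERT INTO LAB_RESULT_CM VALUES (?, ?)",
    [("r1", 4.2), ("r2", None)],
)
conn.execute(
    "INSERT INTO LAB_RESULT_TEXT VALUES (?, ?)",
    ("r2", "long narrative result"),
)

# A LEFT JOIN recovers the wide view when the text is actually needed.
rows = conn.execute("""
SELECT l.LAB_RESULT_CM_ID, t.RESULT_TEXT
FROM LAB_RESULT_CM AS l
LEFT JOIN LAB_RESULT_TEXT AS t
    ON t.LAB_RESULT_CM_ID = l.LAB_RESULT_CM_ID
ORDER BY l.LAB_RESULT_CM_ID
""").fetchall()
```

Sparsely populated text then costs storage only for the rows that have it,
and the main table keeps its fixed, narrow width.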

Best regards,
Alex Stoddard 

Programmer/Analyst
Biomedical Informatics
Clinical & Translational Science Institute
Medical College of Wisconsin
[email protected]

I am currently working remotely
--------------------------------------------------


_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
