I just added Russ and Chuck to our Google Drive shared folder for our i2b2-to-CDM transform. (You already have access to our ontology.) Happy to add anyone else who would like to work with it.
I was able to adapt some of the GPC transform code to our network (the VITALS transform, I think) without too much work, so our approaches are not dissimilar. Although a grand unified transform script is probably not worth the effort at this point, I’d very much look forward to coding collaboration as we all develop v2/v3 transforms. I have not attempted this yet but I plan to dive in soon.

Also, from what I’ve heard, the unique numbers in each row found in v3 are, according to DSSNI, due to a requirement of the Microsoft Entity Framework, which they’ll be using to make a GUI query builder for PopMedNet for Phase II.

Thanks,
Jeff K.

Jeffrey Klann, PhD
Instructor of Medicine, Harvard Medical School
Assistant in Computer Science, Massachusetts General Hospital
PhD in Research, Partners Healthcare Research Computing
ofc: 617-643-5879
email: jkl...@partners.org

From: Russ Waitman [mailto:rwait...@kumc.edu]
Sent: Thursday, May 21, 2015 10:14 AM
To: 'Mandl, Kenneth'; Borromeo, Charles; 'Campbell, James R'; GPC-DEV@LISTSERV.KUMC.EDU; 'McClay, James C (jmcc...@unmc.edu)'
Cc: Murphy, Shawn N.; Rachel Hess (rachel.h...@hsc.utah.edu); Klann, Jeffrey G.
Subject: RE: PCORI CDM V3 vote

Thanks for sharing the PaTH perspective, Chuck.

As previously mentioned, GPC code for this has been up for others to review and contribute:
https://bitbucket.org/gpcnetwork/gpc-pcornet-cdm
Also largely SQL and py.

Other related PCORnet code is at the higher level folder; recent work on QA of data for the cohort and survey work:
https://bitbucket.org/gpcnetwork

Russ

From: Mandl, Kenneth [mailto:kenneth.ma...@childrens.harvard.edu]
Sent: Thursday, May 21, 2015 9:06 AM
To: Borromeo, Charles; Russ Waitman; 'Campbell, James R'; GPC-DEV@LISTSERV.KUMC.EDU; 'McClay, James C (jmcc...@unmc.edu)'
Cc: Shawn N. Murphy (snmur...@partners.org); Rachel Hess (rachel.h...@hsc.utah.edu); Jeff Klann
Subject: Re: PCORI CDM V3 vote

Looping Jeff Klann in. Suggest we continue to collaborate across CDRNs on creating, maintaining and amending the scripts for transformation of i2b2 to CDM.

Kenneth D. Mandl, MD, MPH
Professor, Harvard Chair in Biomedical Informatics and Population Health
Children’s Hospital Informatics Program | Boston Children's Hospital
Center for Biomedical Informatics | Harvard Medical School
http://scholar.harvard.edu/mandl
Twitter: @mandl

From: Borromeo, Charles <ch...@pitt.edu>
Date: Thursday, May 21, 2015 at 9:59 AM
To: Russel Waitman <rwait...@kumc.edu>, James Campbell <campb...@unmc.edu>, "GPC-DEV@LISTSERV.KUMC.EDU" <GPC-DEV@LISTSERV.KUMC.EDU>, "'McClay, James C (jmcc...@unmc.edu)'" <jmcc...@unmc.edu>
Cc: "Kenneth D. Mandl" <kenneth.ma...@childrens.harvard.edu>, Shawn Murphy <snmur...@partners.org>, Rachel Hess <rachel.h...@hsc.utah.edu>
Subject: Re: PCORI CDM V3 vote

Hi Russ,

We met with DSSNI on Monday. The PaTH CDRN shares your concerns about the non-EAV structure of the data model. Dr. Chris Chute (recently joined JHU) also thinks the CDM is very brittle. However, PaTH never dedicated time to developing a viable alternative to the CDM. It seemed like too big of a change to discuss given the CDM 3.0 approval date of May 2015.

I did discuss a short-term flaw with Jeff, Leslie, Laura, and Rich Platt. In PaTH, I am developing some Python scripts to convert our i2b2 data into the CDM. According to DSSNI, the CDRNs should deploy 2 DataMarts: one with EMR data and one with Claims data.
Deploying two DataMarts allows the CDRNs to avoid the issue of combining Claims Encounters with EMR Encounters. Leslie said some CDRNs are required to keep the Claims data separate from the EMR data, thus necessitating 2 DataMarts.

During the development process I found a flaw with the 2 DataMart approach. Basically, the Claims-only DataMart would be missing data in several tables (see attached image) including: VITAL, CONDITION, PRO_CM, LAB_RESULT_CM, and PRESCRIBING. The data for these tables comes from the EMR, not claims. Therefore, the Claims-only DataMart would only be able to answer a subset of research questions. During the discussion, it appeared that Jeff Brown did not have a technical solution allowing him to query across the 2 DataMarts in a single query. Therefore, storing the data in one DataMart would answer more research questions.

I suggested that DSSNI add some columns to the tables allowing the ETL process to describe the data provenance. The columns would include information about the type of encounter (inpatient vs outpatient) and datasource (claims vs EMR). Some of the tables (like PROCEDURES) already have these columns (ENC_TYPE and PX_TYPE). DSSNI would need to check the other tables to ensure this information is present. This approach effectively demotes the importance of the encounter and eliminates the need to combine encounters. There may be some other alternatives. Jeff said he would give this some consideration so we will see what happens.
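
As a rough illustration of the provenance-column suggestion above, here is a minimal sketch in Python using an in-memory SQLite database. The table is heavily simplified, and the DATA_SOURCE column name, its 'EMR'/'CLAIMS' values, and the sample rows are assumptions made up for this example; they are not part of the CDM specification. ENC_TYPE is the column mentioned above, with placeholder codes.

```python
import sqlite3

# Hypothetical, simplified PROCEDURES table held in one DataMart.
# data_source is an assumed provenance column, not a CDM-defined field.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE procedures (
        patid       TEXT NOT NULL,
        px          TEXT NOT NULL,
        enc_type    TEXT NOT NULL,  -- placeholder codes: inpatient vs outpatient
        data_source TEXT NOT NULL CHECK (data_source IN ('EMR', 'CLAIMS'))
    )
    """
)

# Made-up rows: the same procedure can arrive from both sources.
rows = [
    ("P1", "99213", "AV", "EMR"),
    ("P1", "99213", "AV", "CLAIMS"),
    ("P2", "47562", "IP", "CLAIMS"),
]
conn.executemany("INSERT INTO procedures VALUES (?, ?, ?, ?)", rows)


def patients_by_source(connection, source=None):
    """Return distinct patient IDs, optionally limited to one data source."""
    if source is None:
        cur = connection.execute(
            "SELECT DISTINCT patid FROM procedures ORDER BY patid"
        )
    else:
        cur = connection.execute(
            "SELECT DISTINCT patid FROM procedures "
            "WHERE data_source = ? ORDER BY patid",
            (source,),
        )
    return [row[0] for row in cur.fetchall()]


# One query spans both sources; a filter recovers the single-source view.
all_patients = patients_by_source(conn)
emr_patients = patients_by_source(conn, "EMR")
claims_patients = patients_by_source(conn, "CLAIMS")
```

With a column like this, a single DataMart can answer questions across both sources, and a site that must report claims and EMR data separately can still filter on the provenance value rather than join encounters.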
Chuck Borromeo

From: Russ Waitman <rwait...@kumc.edu>
Date: Thursday, May 21, 2015 at 9:20 AM
To: "'Campbell, James R'" <campb...@unmc.edu>, "GPC-DEV@LISTSERV.KUMC.EDU" <GPC-DEV@LISTSERV.KUMC.EDU>, "'McClay, James C (jmcc...@unmc.edu)'" <jmcc...@unmc.edu>
Cc: Charles Borromeo <ch...@pitt.edu>, "Mandl, Kenneth (kenneth.ma...@childrens.harvard.edu)" <kenneth.ma...@childrens.harvard.edu>, "Shawn N. Murphy (snmur...@partners.org)" <snmur...@partners.org>, Rachel Hess <rachel.h...@hsc.utah.edu>
Subject: RE: PCORI CDM V3 vote

Dear Jim and GPC Dev,

Thanks for the good discussion Tuesday regarding the CDM 3 vote:
https://docs.google.com/document/d/1ih4XGJVrTjIH7xOHAnQOqfqKvLl9ZxovS5PXgFOT7qo/edit

Did we as the GPC or any other CDRNs ever propose alternatives or improved modifications to the CDM draft?
If not, was it because
- No opportunity
- There was a sense it was futile
- No interest

Do we have written recommendations to improve CDM3 or specifically identify the flaws or most difficult to maintain sections?

At a high level I view adding prescribing/ordered medications as good.
I am still concerned this non-EAV model for each domain is very expensive to augment and maintain.
It would be preferable to share extensible enhancements with the group as an alternative.

Russ

From: Campbell, James R [mailto:campb...@unmc.edu]
Sent: Monday, May 18, 2015 7:10 AM
To: Russ Waitman
Subject: RE: PCORI CDM V3 vote

On Saturday PCORI preemptively cancelled the DSSNI call scheduled for today. There has been no organized discussion of CDM for a month now. They have moved the decision making to the PIs, apparently to limit debate and need for their response. It seems they have been missing every deadline they set for themselves and I am not sure what to expect. Are you saying they have not been discussing this within the PI steering group either?

Jim

________________________________________
From: Russ Waitman [rwait...@kumc.edu]
Sent: Monday, May 18, 2015 6:25 AM
To: Campbell, James R
Cc: Dan Connolly; McClay, James C
Subject: Re: PCORI CDM V3 vote

That secure transmission is the fault of the KUMC email system. No idea why it did that.

I think we are all somewhat unenthusiastic about the direction of the CDM. Do you have suggestions that would improve the next iteration? Any chance to bring those forward to Disney?

Russ

On May 17, 2015, at 10:23 AM, Campbell, James R <campb...@unmc.edu> wrote:

Russ,

Thanks for sharing the CDM V3 document with me. Why the secure transmission? I thought this was public knowledge? Have they been discussing these changes in the CDM in the PI forum?

Looking through the copy that you sent me I count over 35 data attributes ADDED since our input was tendered on V3.
Many of those additions do nothing to improve data quality (like all the temporary primary identifier fields we will have to generate... we need to be sure they are serious that we do not have to maintain IDs across refresh cycles) and will be a lot of work for GPC data managers. I can understand that perhaps they will be useful for the central data warehouse managers and presume that is where the requirements originated. I assume that many networks are refusing to release non-obfuscated dates without full IRB and so I appreciate the rationale for the proliferation of HARVEST.Attributes but that table will have to be regenerated for each trial report assuming that we will have a mixture of IRB approved and non-approved trials. They are giving lip service to compliance with meaningful use standardization but are adding duplicate data identification requirements (PRO_CM.PRO_ITEM; LAB_RESULT.LAB_NAME are examples) that create overhead for our data managers and require mapping tables in addition to what our sites are doing for ONC compliance.

I was surprised by the appearance of the table 2.5 “Implementation Expectations” on page 6. Are a lot of CDRNs not able to produce LAB, CONDITION, DEATH and PRESCRIBING datasets? Will these be the factors that separate the men from the boys in trial participation? I don’t see how they can do the ADAPTABLE trial from EHR data harvest without some of these data sets.

In short, this V3 document creates a lot of new requirements for our data managers, many with apparently arbitrary specs. If we can take table 2.5 literally, GPC should be able to meet CDM compliance in the next few months but I ask if the OPTIONAL tables will not be the mark of the truly successful CDRN and therefore required for our long term viability.

Please provide your perspectives on this. What is the discussion among PIs? Is the snowball already halfway down the hill?
Jim

NEW or REVISED ELEMENTS IN CDM V3

DIAGNOSIS.DIAGNOSISID (Unique over time for all queries to site?? They say no and so I ask WHY?)
PROCEDURE.PROCEDURESID
VITAL.SMOKING
VITAL.TOBACCO (CHANGED FROM V2; IT APPEARS THAT THEY HAVE CREATED DUPLICATE ENTRIES FOR SMOKING BEHAVIOR AND HAVE CHANGED V2 DEFINITIONS ON TOBACCO TYPE. THIS FLIES IN THE FACE OF WHAT WE ARE BEING REQUIRED TO REPORT FOR MEANINGFUL USE)
VITAL.TOBACCO_TYPE
DISPENSING.DISPENSINGID
DISPENSING.PRESCRIBINGID (QUESTIONABLE ADDITION! THOSE OF OUR SITES THAT CAN REPORT THIS WILL BE ACCEPTING SURESCRIPTS DATA THAT THEY HAVE NOT ORIGINATED)
DISPENSING.NDC (SHOULD SPECIFICALLY DRAW FROM NLM RXNAV PUBLICATION)
[LAB_RESULT.LAB_NAME CREATES BURDEN FOR MAPPING ALL TEST NAMES IN ADDITION TO LOINC SHORT NAME WHICH SHOULD BE QUITE ADEQUATE FOR RESEARCH PURPOSES]
LAB_RESULT.NORM_MODIFIER_LO
LAB_RESULT.NORM_MODIFIER_HI
CONDITION.CONDITIONID
PRO_CM.PRO_CMID
PRO_CM.PRO_ITEM (REDUNDANT WITH LOINC CODE ?WHY?)
PRESCRIBING.ORDER_TIME
PRESCRIBING _FREQUENCY
PRESCRIBING.RX_BASIS (NEW; THIS IS INCONSISTENT WITH GUIDANCE ON NATURE OF THE DISPENSING RECORD AND MAKES NO SENSE!)
PCORNET_TRIAL.PARTICIPANTID
PCORNET_TRIAL.TRIALSITEID
HARVEST.BIRTH_DATE_MGMT
HARVEST.ENR_START_DATE_MGMT
HARVEST.ENR_END_DATE_MGMT
HARVEST.ADMIT_DATE_MGMT
HARVEST.DISCHARGE_DATE_MGMT
HARVEST.PX_DATE_MGMT
HARVEST.RX_ORDER_DATE_MGMT
HARVEST.RX_START_DATE_MGMT
HARVEST.RX_END_DATE_MGMT
HARVEST.RESULT_DATE
HARVEST.MEASURE_DATE
HARVEST.ONSET_DATE_MGMT
HARVEST.REPORT_DATE_MGMT
HARVEST.RESOLVE_DATE_MGMT
HARVEST.PRO_DATE_MGMT
HARVEST.REFRESH_DEMOGRAPHIC_DATE
HARVEST.REFRESH_PRESCRIBING_DATE
HARVEST.REFRESHPCORNET_TRIAL_DATE
HARVEST.REFRESH_DEATH_DATE
HARVEST.REFRESH_DEATH_CAUSE_DATE

The information in this e-mail may be privileged and confidential, intended only for the use of the addressee(s) above. Any unauthorized use or disclosure of this information is prohibited.
If you have received this e-mail by mistake, please delete it and immediately contact the sender.

<v3_VOTE.docx>

Russ Waitman, PhD
Director of Medical Informatics
Assistant Vice Chancellor for Enterprise Analytics
Associate Professor, Department of Internal Medicine
University of Kansas Medical Center, Kansas City, Kansas
913-945-7087 (office)
rwait...@kumc.edu
http://www.kumc.edu/ea-mi/
http://informatics.kumc.edu
http://informatics.gpcnetwork.org a PCORNet collaborative

The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline .
If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail.
_______________________________________________ Gpc-dev mailing list Gpc-dev@listserv.kumc.edu http://listserv.kumc.edu/mailman/listinfo/gpc-dev