Re: R, SQL, i2b2, and governance RE: Example (Re: Empirical Data Dictionary)

Alex Bokov Tue, 07 Oct 2014 07:42:14 -0700

Yes, I'm fully in agreement there-- no direct, unlimited queries by investigators. Or under normal circumstances, us, for that matter. The only variable parts are the patient/visit sets.

I see the goal of the initial cohort work as learning how to generalize these queries so they can be automated and run without exposing the end user of the data to unnecessary internals. My actual cohort query is a lot less broad than the Empirical Data Dictionary. It will basically take an i2b2-generated patient/visit set and do the non-Oracle-specific equivalent of a pivot on year for all of them. It will also return a top-N by prevalence list of leaf concepts for that patient set or visit set.
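For illustration only, here is a minimal sketch of what such a query could look like in Python against an SQLite file. It is not the actual cohort query. The table names patient_set and observation_fact, the sample rows, and the helper names build_demo_db, pivot_by_year, and top_n_concepts are all assumptions made up for this example. The column names patient_num, concept_cd, and start_date follow i2b2's observation_fact table, and dates are assumed to be stored as ISO text. The pivot uses conditional aggregation (CASE inside COUNT) rather than Oracle's PIVOT clause, and prevalence is taken to mean the number of distinct patients with at least one fact for a concept.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE patient_set (patient_num INTEGER PRIMARY KEY);
        CREATE TABLE observation_fact (
            patient_num INTEGER, concept_cd TEXT, start_date TEXT);
    """)
    conn.executemany("INSERT INTO patient_set VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "INSERT INTO observation_fact VALUES (?, ?, ?)",
        [
            (1, "DX:A", "2012-03-01"),
            (1, "DX:A", "2013-05-10"),
            (2, "DX:A", "2013-01-15"),
            (2, "DX:B", "2013-02-20"),
            (3, "DX:B", "2012-07-04"),
            (3, "DX:C", "2013-09-09"),
            (9, "DX:C", "2013-09-09"),  # patient 9 is not in the patient set
        ],
    )
    return conn


def pivot_by_year(conn, years):
    """One row per concept, one column per year: distinct patients that year."""
    # int(y) guards the generated SQL; years are the only interpolated values.
    cols = ", ".join(
        f"COUNT(DISTINCT CASE WHEN substr(f.start_date, 1, 4) = '{int(y)}' "
        f"THEN f.patient_num END) AS y{int(y)}"
        for y in years
    )
    sql = f"""
        SELECT f.concept_cd, {cols}
        FROM observation_fact f
        JOIN patient_set s ON s.patient_num = f.patient_num
        GROUP BY f.concept_cd
        ORDER BY f.concept_cd
    """
    return conn.execute(sql).fetchall()


def top_n_concepts(conn, n):
    """Top-N concepts by number of distinct patients in the patient set."""
    # LIMIT is not portable to every SQL dialect; it is fine for SQLite.
    sql = """
        SELECT f.concept_cd, COUNT(DISTINCT f.patient_num) AS n_patients
        FROM observation_fact f
        JOIN patient_set s ON s.patient_num = f.patient_num
        GROUP BY f.concept_cd
        ORDER BY n_patients DESC, f.concept_cd
        LIMIT ?
    """
    return conn.execute(sql, (n,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(pivot_by_year(conn, [2012, 2013]))
    print(top_n_concepts(conn, 2))
```

The only variable input is the patient set; the query text itself is fixed apart from the list of years, which is the property that would make it reasonable to review the generic code once and then reuse it.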

The logic is similar whether it's implemented in R or one's favorite flavor of SQL (even if the actual syntax is mind-meltingly different).

What is meant by an audited query? That the code for the generic query has been reviewed by a trusted party? Or that each individual instance of a query is reviewed manually before being approved? Either sounds like a good idea.



On 10/06/2014 05:42 PM, Dan Connolly wrote:

(Please excuse the awkward top-posting format; I'm stuck with Microsoft Outlook.)
Perhaps we're converging... the new Data Builder code <https://informatics.gpcnetwork.org/trac/Project/ticket/134#comment:4> delivers an sqlite3 file, so you can continue to use SQL to analyze it; and if you like Python or Java better than R for post-SQL work, that's fine too.
But note that each Data Builder result is based on an i2b2 patient set that came from an *audited i2b2 query*. We don't have governance to let investigators run arbitrary SQL queries on our whole clinical data warehouse and we don't plan to (neither KUMC HERON nor GPC). For the 3 initial cohorts, we can get away with ad-hoc one-off work, but for GPC work in general, we plan to use i2b2 to do as much of the querying as we can.

_______________________________________________
Gpc-dev mailing list
Gpc-dev@listserv.kumc.edu
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
