Unable to get into meeting, can someone send invite from inside teh meeting?
Bonnie L. Westra, PhD, RN, FAAN, FACMI Associate Professor, University of Minnesota, School of Nursing & Institute for Health Informatics Director, Center for Nursing Informatics Location - WDH 6-155 P - 612-625-4470, Fax - 612-625-7091 email - westr...@umn.edu Mail - WDH 5-140, 308 Harvard St SE, Minneapolis, MN 55455 On Tue, Jul 28, 2015 at 9:20 AM, GPC Informatics <d...@madmode.com> wrote: > #240: Data Characterization queries for phase 1 CDM V1 compliance > -------------------------------------------------+------------------------- > Reporter: campbell | Owner: huhickman > Type: task | Status: assigned > Priority: major | Milestone: drn- > Component: data-stds | query-1 > Keywords: cdm-etl CDM Compliance data-quality | Resolution: > Blocking: | Blocked By: > -------------------------------------------------+------------------------- > > Comment (by huhickman): > > We've ran through all of the 70 queries except for two; these two as > written will never complete - I've let them run for days just to see what > will happen. These two are > > The SQL is generally quite inefficient - with exclusive use of subqueries > for correlation of data (no WITH clause to factor out common query > clauses). > > > 1. Repeatedly converting the ISO-8601 character string dates (per > CDM spec) into native date datatypes on the fly. > 2. No use of with clause to factor out subqueries (so that they get > evaluated only once). > 3. No recommended indexes in CDM V1,2, or 3. > > One of the samples that never completes : > > {{{ > #!sql > set echo off > set feedback off > SET HEAD OFF > SET PAGES 0 > SET VERIFY OFF > SET FEED OFF > SET SHOW OFF > SET TERMOUT OFF > ALTER SESSION SET PLSQL_WARNINGS='DISABLE:ALL'; > -- > ---- TotalValidHt Function - variable > create or replace function fnTotalValidHt return int is > TotalValidHt number; > begin > SELECT COUNT(ht) INTO TotalValidHt FROM Vital WHERE ht IS NOT NULL; > > return TotalValidHt; > end; > / > --vit_l3_ht_dist > SET TERMOUT ON > SELECT vit_l3_ht_dist FROM > (SELECT 0 AS position, '"Statistic","HT"' AS vit_l3_ht_dist FROM dual > UNION > SELECT 1 AS position, '"' || Statistic || '",' || CAST(ht AS > VARCHAR(15)) FROM > (SELECT 'MIN' AS Statistic, MIN(ht) AS ht FROM Vital WHERE ht IS > NOT NULL > UNION > SELECT 'MAX' AS Statistic, MAX(ht) AS ht FROM > Vital WHERE ht IS NOT NULL > UNION > SELECT 'MEAN' AS Statistic, SUM(ht)/COUNT(ht) AS > ht FROM Vital WHERE ht IS NOT NULL > UNION > -- Median is calculated by sorting all > records by ht AND getting the ht value of the midpoint of all records > SELECT 'MEDIAN' AS Statistic, d.ht AS ht FROM > (SELECT ROW_NUMBER() OVER (ORDER BY ht desc) AS > rowno, ht FROM Vital WHERE ht IS NOT NULL > ) d > WHERE d.rowno = CAST(fnTotalValidHt / 2 AS INT) > AND ht IS NOT NULL > UNION > -- Records with missing patid are counted here for completeness. > SELECT 'NULL or missing' AS Statistic, COUNT(*) AS > ht FROM Vital WHERE ht IS NULL > UNION > SELECT 'record_n' AS Statistic, COUNT(*) AS ht > FROM Vital WHERE ht IS NOT NULL > )csv > ORDER BY position,vit_l3_ht_dist > )m; > exit > > }}} > > The other one that does not complete is the same query except for weight. > > In the above example using Oracle's MEDIAN function will help cure some of > the ills along with factoring out some of the subselects. I will rewrite > and send feedback for these two in particular. > > -- > Ticket URL: < > http://informatics.gpcnetwork.org/trac/Project/ticket/240#comment:20> > gpc-informatics <http://informatics.gpcnetwork.org/> > Greater Plains Network - Informatics > _______________________________________________ > Gpc-dev mailing list > Gpc-dev@listserv.kumc.edu > http://listserv.kumc.edu/mailman/listinfo/gpc-dev >
_______________________________________________ Gpc-dev mailing list Gpc-dev@listserv.kumc.edu http://listserv.kumc.edu/mailman/listinfo/gpc-dev