Blank concept codes? I'm pretty sure that's not by design. Do you mean null? 
The i2b2 fact table schema would prevent those from loading, no?

This is the expression for concept_cd; I don't see how it could ever evaluate 
to blank:


(naaccr_txform.sql, lines 272-274:
https://informatics.kumc.edu/work/browser/heron_load/naaccr_txform.sql#L272)

             , 'NAACCR|' || ne.ItemNbr || ':' || (
                 case when ni."Format" = 'YYYYMMDD' then null
                 else value end) as concept_cd

Oh... I think I see what you're asking.

The bulk of the data in the NAACCR file is nominal or ordinal; for example, 
Grade, which has item number 440. If the tumor grade is observed to be 1, the 
concept_cd is the item number combined with the value: NAACCR|440:1. It encodes 
the question (grade?) and the answer (1).

The exception is dates: date of diagnosis, date of last contact, etc. In that 
case, we put the value in the start_date, since that's where the i2b2 date 
constraints apply. The concept_cd for all values of date of diagnosis is 
NAACCR|390:. In this case, the concept_cd just encodes the question; we put the 
answer in the start_date column.
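
To make the two cases concrete, here is a minimal Python sketch of the encoding described above. It is only an illustration, not the code that runs in HERON: the names naaccr_fact and Fact are hypothetical, and it assumes the Oracle behavior where concatenating null with || contributes an empty string, which is why the date case ends at the colon. How start_date is filled for non-date items is not modeled here.

```python
from datetime import date, datetime
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """Hypothetical stand-in for the two fact columns discussed here."""
    concept_cd: str
    start_date: Optional[date]


def naaccr_fact(item_nbr: int, fmt: Optional[str], value: str) -> Fact:
    """Model the concept_cd expression for one NAACCR item value."""
    if fmt == "YYYYMMDD":
        # Date item: the code encodes only the question; the answer
        # goes in start_date. Appending null in Oracle adds nothing,
        # so the code ends at ':'.
        parsed = datetime.strptime(value, "%Y%m%d").date()
        return Fact(f"NAACCR|{item_nbr}:", parsed)
    # Nominal/ordinal item: the code encodes question and answer.
    # start_date is left unset because this sketch does not model it.
    return Fact(f"NAACCR|{item_nbr}:{value}", None)


grade = naaccr_fact(440, None, "1")
diagnosis = naaccr_fact(390, "YYYYMMDD", "20141107")
print(grade.concept_cd)
print(diagnosis.concept_cd, diagnosis.start_date)
```

So a date item never gets a blank or null concept_cd; it gets a code that stops after the item number and colon.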

--
Dan


________________________________
From: gpc-dev-boun...@listserv.kumc.edu [gpc-dev-boun...@listserv.kumc.edu] on 
behalf of Lenon Patrick [ple...@uwhealth.org]
Sent: Friday, November 07, 2014 1:29 PM
To: 'Gpc-dev@listserv.kumc.edu'
Subject: Q re: NAACCR txform start date

I’m working through the official GPC 
MultiSiteDev<https://informatics.gpcnetwork.org/trac/Project/wiki/MultiSiteDev> 
code (thanks Nathan G) for transforming and loading NAACCR data and I have a 
question regarding the way the I2B2 Start_Date is derived.

In naaccr_txform.sql, the tumor_item_value view is created with the start_date 
field filled with date info from any date-formatted field, like date of 
diagnosis, date of birth, etc.  The concept code is null.

Then the tumor_reg_facts view is created from merging naaccr.extract with the 
tumor_item_value view.  Net result is a lot of rows in tumor_reg_facts with a 
start_date field filled in, but with blank concept codes, each row representing 
a date field from the same NAACCR tumor record.

All the fact table fields including start date appear to be filled in 
accurately, so I’m mostly wondering what happens to those records with blank 
concept codes.  The naaccr_facts_load.sql script doesn’t appear to address them.

So, do those concept-less fact table records serve some purpose, or are they 
just plucked out later?  I hope this doesn’t seem like nitpicking.  As an I2B2 
newbie I just want to make sure I’m not missing something.  Thanks.



Patrick Lenon
HIMC Informatics Specialist
608 890 5671

_______________________________________________
Gpc-dev mailing list
Gpc-dev@listserv.kumc.edu
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
