Blank concept codes? I'm pretty sure that's not by design. Do you mean null? The i2b2 fact table schema would prevent those from loading, no?
This is the expression for concept_cd; I don't see how it could ever evaluate to blank (naaccr_txform.sql, lines 272-274: https://informatics.kumc.edu/work/browser/heron_load/naaccr_txform.sql#L272):

    , 'NAACCR|' || ne.ItemNbr || ':' || (
        case when ni."Format" = 'YYYYMMDD' then null
        else value end) as concept_cd

Oh... I think I see what you're asking.

The bulk of the data in the NAACCR file is nominal or ordinal; for example, Grade, which has item number 440. If the tumor grade is observed to be 1, the concept_cd is the item number combined with the value: NAACCR|440:1. It encodes the question (grade?) and the answer (1).

The exception is dates: date of diagnosis, date of last contact, etc. In that case, we put the value in the start_date, since that's where the i2b2 date constraints apply. The concept_cd for all values of date of diagnosis is NAACCR|390:. In this case, the concept_cd just encodes the question; we put the answer in the start_date column.

-- 
Dan

________________________________
From: gpc-dev-boun...@listserv.kumc.edu [gpc-dev-boun...@listserv.kumc.edu] on behalf of Lenon Patrick [ple...@uwhealth.org]
Sent: Friday, November 07, 2014 1:29 PM
To: 'Gpc-dev@listserv.kumc.edu'
Subject: Q re: NAACCR txform start date

I’m working through the official GPC MultiSiteDev <https://informatics.gpcnetwork.org/trac/Project/wiki/MultiSiteDev> code (thanks Nathan G) for transforming and loading NAACCR data and I have a question regarding the way the I2B2 Start_Date is derived.

In naaccr_txform.sql, the tumor_item_value view is created with the start_date field filled with date info from any date-formatted field, like date of diagnosis, date of birth, etc. The concept code is null. Then the tumor_reg_facts view is created from merging naaccr.extract with the tumor_item_value view.
Net result is a lot of rows in tumor_reg_facts with a start_date field filled in, but with blank concept codes, each row representing a date field from the same NAACCR tumor record.

All the fact table fields including start date appear to be filled in accurately, so I’m mostly wondering what happens to those records with blank concept codes. The naaccr_facts_load.sql script doesn’t appear to address them. So, do those concept-less fact table records serve some purpose, or are they just plucked out later?

I hope this doesn’t seem like nitpicking. As an I2B2 newbie I just want to make sure I’m not missing something.

Thanks.

Patrick Lenon
HIMC Informatics Specialist
608 890 5671
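For readers following along, the concept_cd rule Dan describes in his reply can be sketched in Python. This is only an illustration, not code from heron_load: the function name `naaccr_concept_cd`, the placeholder format label, and the sample date value are assumptions. It also assumes Oracle semantics for the quoted SQL, where a null operand of `||` behaves like an empty string; that assumption is consistent with the `NAACCR|390:` result described in the reply.

```python
from datetime import datetime
from typing import Optional


def naaccr_concept_cd(item_nbr: int, item_format: str, value: Optional[str]) -> str:
    """Mimic the quoted concept_cd expression (hypothetical helper).

    Date-formatted items drop the value, so the code names only the
    question; every other item appends the value as the answer.
    """
    answer = None if item_format == 'YYYYMMDD' else value
    # Assumption: Oracle-style concatenation, where null acts like ''.
    return 'NAACCR|{}:{}'.format(item_nbr, answer or '')


# Grade (item 440) observed as 1: question and answer are both encoded.
# 'CODE' is a placeholder for any non-date format label.
grade_cd = naaccr_concept_cd(440, 'CODE', '1')

# Date of diagnosis (item 390): only the question is encoded.
# '20141107' is an illustrative value, not real data.
dx_cd = naaccr_concept_cd(390, 'YYYYMMDD', '20141107')

# The answer for a date item belongs in start_date instead.
dx_start_date = datetime.strptime('20141107', '%Y%m%d').date()

print(grade_cd)
print(dx_cd, dx_start_date)
```

Under these assumptions the date rows carry a concept_cd that is neither null nor blank; it simply has nothing after the colon. Note that in databases where concatenating null yields null (standard SQL behavior), the same expression would behave differently.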